Abstract



This paper introduces a unified framework for the detection of a single source with a sensor array in 
the context where the noise variance and the channel between the source and the sensors are unknown at 
the receiver. The Generalized Maximum Likelihood Test is studied and yields the analysis of the ratio
between the maximum eigenvalue of the sampled covariance matrix and its normalized trace. Using 
recent results from random matrix theory, a practical way to evaluate the threshold and the p-value of 
the test is provided in the asymptotic regime where the number K of sensors and the number N of 
observations per sensor are large but have the same order of magnitude. The theoretical performance of 
the test is then analyzed in terms of Receiver Operating Characteristic (ROC) curve. It is in particular 
proved that both Type I and Type II error probabilities converge to zero exponentially as the dimensions 
increase at the same rate, and closed-form expressions are provided for the error exponents. These 
theoretical results rely on a precise description of the large deviations of the largest eigenvalue of spiked 
random matrix models, and establish that the presented test asymptotically outperforms the popular test 
based on the condition number of the sampled covariance matrix. 
I. Introduction

The detection of a source by a sensor array is at the heart of many wireless applications. It is
of particular interest in the realm of cognitive radio [1], [2] where a multi-sensor cognitive device
(or a collaborative network¹) needs to discover or sense by itself the surrounding environment.
This allows the cognitive device to make relevant choices in terms of information to feed back,
bandwidth to occupy or transmission power to use. When the cognitive device is switched on, its
prior knowledge (on the noise variance for example) is very limited and can rarely be estimated
prior to the reception of data. This unfortunately rules out classical techniques based on energy
detection [4], [5], and requires new sophisticated techniques exploiting the space or spectrum
dimension.

In our setting, the aim of the multi-sensor cognitive detection phase is to construct and analyze 
tests associated with the following hypothesis testing problem: 

\[
y(n) = \begin{cases} w(n) & \text{under } H_0 \\ h\, s(n) + w(n) & \text{under } H_1 \end{cases}
\qquad \text{for } n = 0 : N-1 , \tag{1}
\]

where $y(n) = [y_1(n), \ldots, y_K(n)]^T$ is the observed $K \times 1$ complex time series, $w(n)$ represents
a $K \times 1$ complex circular Gaussian white noise process with unknown variance $\sigma^2$, and $N$
represents the number of received samples. Vector $h \in \mathbb{C}^{K \times 1}$ is a deterministic vector and
typically represents the propagation channel between the source and the $K$ sensors. Signal
$s(n)$ denotes a standard scalar independent and identically distributed (i.i.d.) circular complex
Gaussian process with respect to the samples $n = 0 : N-1$ and stands for the source signal to
be detected.
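To make model (1) concrete, the following minimal sketch draws synthetic observations under either hypothesis using only the Python standard library. The function names, the channel values and the sample sizes are illustrative assumptions and are not taken from the paper.

```python
import random

def complex_gaussian(rng, variance=1.0):
    """Draw one circular complex Gaussian sample z with E|z|^2 = variance."""
    s = (variance / 2.0) ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def simulate_observations(h, sigma2, n_samples, source_present, rng):
    """Return the N columns y(0), ..., y(N-1) of model (1), each of length K."""
    columns = []
    for _ in range(n_samples):
        # Unit-variance source sample under H1, no source under H0.
        s = complex_gaussian(rng) if source_present else 0.0
        columns.append([hk * s + complex_gaussian(rng, sigma2) for hk in h])
    return columns

rng = random.Random(0)
h = [1.0 + 0.5j, -0.3j, 0.8, 0.2 - 0.1j]  # hypothetical channel, K = 4
Y = simulate_observations(h, sigma2=1.0, n_samples=50, source_present=True, rng=rng)
```

Each entry of `Y` is one snapshot $y(n)$; passing `source_present=False` yields noise-only data as under $H_0$.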

The standard case where the propagation channel and the noise variance are known has been
thoroughly studied in the literature in the Single Input Single Output case [4], [5], [6] and
Multi-Input Multi-Output [7] case. In this simple context, the most natural approach to detect the
presence of source $s(n)$ is the well-known Neyman-Pearson (NP) procedure which consists in
rejecting the null hypothesis when the observed likelihood ratio lies above a certain threshold
[8]. Traditionally, the value of the threshold is set in such a way that the Probability of False
Alarm (PFA) is no larger than a predefined level $\alpha \in (0, 1)$.

¹The collaborative network corresponds to multiple base stations connected, in a wireless or wired manner, to form a virtual
antenna system [3].

Recall that the PFA (resp. the miss probability) of a test is defined as the probability that the
receiver decides hypothesis $H_1$ (resp. $H_0$) when the true hypothesis is $H_0$ (resp. $H_1$). The NP
test is known to be uniformly most powerful i.e., for any level $\alpha \in (0, 1)$, the NP test has the
minimum achievable miss probability (or equivalently the maximum achievable power) among all
tests of level $\alpha$. In this paper, we assume on the contrary that:

• the noise variance $\sigma^2$ is unknown,

• vector h is unknown. 

In this context, probability density functions of the observations $y(n)$ under both $H_0$ and $H_1$
are unknown, and the classical NP approach can no longer be employed. As a consequence, the
construction of relevant tests for (1) together with the analysis of their performance is a crucial
issue. The classical approach followed in this paper consists in replacing the unknown parameters
by their maximum likelihood estimates. This leads to the so-called Generalized Likelihood Ratio
(GLR). The Generalized Likelihood Ratio Test (GLRT), which rejects the null hypothesis for large
values of the GLR, easily reduces to the statistics given by the ratio of the largest eigenvalue of
the sampled covariance matrix with its normalized trace, cf. [9], [10], [11]. Nearby statistics [12],
[13], [14], [15], with good practical properties, have also been developed, but would not yield
a different (asymptotic) error exponent analysis.

In this paper, we analyze the performance of the GLRT in the asymptotic regime where the 
number K of sensors and the number N of observations per sensor are large but have the same 
order of magnitude. This assumption is relevant in many applications, among which cognitive 
radio for instance, and casts the problem into a large random matrix framework. 

Large random matrix theory has already been applied to signal detection [16] (see also [17]),
and recently to hypothesis testing [15], [18], [19]. In this article, the focus is mainly devoted to
the study of the largest eigenvalue of the sampled covariance matrix, whose behaviour changes
under $H_0$ or $H_1$. The fluctuations of the largest eigenvalue under $H_0$ have been described by
Johnstone [20] by means of the celebrated Tracy-Widom distribution, and are used to study the
threshold and the p-value of the GLRT.

In order to characterize the performance of the test, a natural approach would have been to 
evaluate the Receiver Operating Characteristic (ROC) curve of the GLRT, that is to plot the 
power of the test versus a given level of confidence. Unfortunately, the ROC curve does not 
admit any simple closed-form expression for a finite number of sensors and snapshots. As the
miss probability of the GLRT goes exponentially fast to zero, the performance of the GLRT
is analyzed via the computation of its error exponent, which characterizes the speed of decrease
to zero. Its computation relies on the study of the large deviations of the largest eigenvalue
of 'spiked' sampled covariance matrices. By 'spiked' we refer to the case where the largest eigenvalue
converges outside the bulk of the limiting spectral distribution, which precisely happens under
hypothesis $H_1$. We build upon [21] to establish the large deviation principle, and provide a
closed-form expression for the rate function. 

We also introduce the error exponent curve, and plot the error exponent of the power of the 
test versus the error exponent for a given level of confidence. The error exponent curve can 
be interpreted as an asymptotic version of the ROC curve in a log-log scale and enables us to 
establish that the GLRT outperforms another test based on the condition number, and proposed 
by [22], [23], [24] in the context of cognitive radio.

Notice that the results provided here (determination of the threshold of the GLRT test and the 
computation of the error exponents) would still hold within the setting of real Gaussian random 
variables instead of complex ones, with minor modifications².

The paper is organized as follows. 

Section II introduces the GLRT. The value of the threshold, which completes the definition
of the GLRT, is established in Section II-B. As the latter threshold has no simple closed-form
expression and as its practical evaluation is difficult, we introduce in Section II-C an asymptotic
framework where it is assumed that both the number of sensors K and the number N of available 
snapshots go to infinity at the same rate. This assumption is valid for instance in cognitive radio 
contexts and yields a very simple evaluation of the threshold, which is important in real-time 
applications. 

In Section III, we recall several results of large random matrix theory, among which the
asymptotic fluctuations of the largest eigenvalue of a sample covariance matrix, and the limit of 
the largest eigenvalue of a spiked model. 

These results are used in Section IV where an approximate threshold value is derived, which
leads to the same PFA as the optimal one in the asymptotic regime. This analysis yields a
relevant practical method to approximate the p-values associated with the GLRT.

²Details are provided in Remarks 4 and 9.

Section V is devoted to the performance analysis of the GLRT. We compute the error exponent
of the GLRT, derive its expression in closed-form by establishing a Large Deviation Principle
for the test statistic $T_N$³, and describe the error exponent curve.

Section VI introduces the test based on the condition number, that is the statistics given by
the ratio between the largest eigenvalue and the smallest eigenvalue of the sampled covariance 
matrix. We provide the error exponent curve associated with this test and prove that the latter 
is outperformed by the GLRT. 

Section VII provides further numerical illustrations and conclusions are drawn in Section VIII.

Mathematical details are provided in the Appendix. In particular, a full rigorous proof of a 
large deviation principle is provided in Appendix El while a more informal proof of a nearby 
large deviation principle, maybe more accessible to the non-specialist, is provided in Appendix 

E 



Notations 

For $i \in \{0, 1\}$, $\mathbb{P}_i(\mathcal{E})$ represents the probability of a given event $\mathcal{E}$ under hypothesis $H_i$. For
any real random variable $T$ and any real number $\gamma$, notation

\[
T \underset{H_0}{\overset{H_1}{\gtrless}} \gamma
\]

stands for the test function which rejects the null hypothesis when $T > \gamma$. In this case, the
probability of false alarm (PFA) of the test is given by $\mathbb{P}_0(T > \gamma)$, while the power of the test is
$\mathbb{P}_1(T > \gamma)$. Notation $\xrightarrow[H_i]{a.s.}$ stands for the almost sure (a.s.) convergence under hypothesis $H_i$. For
any one-to-one mapping $T : \mathcal{X} \to \mathcal{Y}$ where $\mathcal{X}$ and $\mathcal{Y}$ are two sets, we denote by $T^{-1}$ the inverse
of $T$ w.r.t. composition. For any Borel set $A \subset \mathbb{R}$, $x \mapsto 1_A(x)$ denotes the indicator function of
set $A$ and $\|x\|$ denotes the Euclidean norm of a given vector $x$. If $A$ is a given matrix, denote
by $A^H$ its transpose-conjugate. If $F$ is a cumulative distribution function (c.d.f.), we denote by
$\bar{F}$ its complementary c.d.f., that is: $\bar{F} = 1 - F$.

³Note that in recent papers [25], [14], [15], the fluctuations of the test statistics under $H_1$, based on large random matrix
techniques, have also been used to approximate the power of the test. We believe that the performance analysis based on the
error exponent approach, although more involved, has a wider range of validity.

II. Generalized Likelihood Ratio Test

In this section, we derive the Generalized Likelihood Ratio Test (Section II-A) and compute
the associated threshold and p-value (Section II-B). This exact computation raises some
computational issues, which are circumvented by the introduction of a relevant asymptotic framework,
well-suited for mathematical analysis (Section II-C).

A. Derivation of the Test 

Denote by $N$ the number of observed samples and recall that:

\[
y(n) = \begin{cases} w(n) & \text{under } H_0 \\ h\, s(n) + w(n) & \text{under } H_1 \end{cases}
\qquad n = 0 : N-1 ,
\]

where $(w(n), 0 \le n \le N-1)$ represents an independent and identically distributed (i.i.d.)
process of $K \times 1$ vectors with circular complex Gaussian entries with mean zero and covariance
matrix $\sigma^2 I_K$, vector $h \in \mathbb{C}^{K \times 1}$ is deterministic, signal $(s(n), 0 \le n \le N-1)$ denotes a
scalar i.i.d. circular complex Gaussian process with zero mean and unit variance. Moreover,
$(w(n), 0 \le n \le N-1)$ and $(s(n), 0 \le n \le N-1)$ are assumed to be independent processes.
We stack the observed data into a $K \times N$ matrix $Y = [y(0), \ldots, y(N-1)]$. Denote by $R$ the
sampled covariance matrix:

\[
R = \frac{1}{N} Y Y^H
\]

and respectively, by $p_0(Y; \sigma^2)$ and $p_1(Y; h, \sigma^2)$ the likelihood functions of the observation matrix
$Y$ indexed by the unknown parameters $h$ and $\sigma^2$ under hypotheses $H_0$ and $H_1$.

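As $Y$ is a $K \times N$ matrix whose columns are i.i.d. Gaussian vectors with covariance matrix
$\Sigma$ defined by:

\[
\Sigma = \begin{cases} \sigma^2 I_K & \text{under } H_0 \\ h h^H + \sigma^2 I_K & \text{under } H_1 \end{cases} \tag{2}
\]

the likelihood functions write:

\[
p_0(Y; \sigma^2) = (\pi \sigma^2)^{-NK} \exp\left( -\frac{N}{\sigma^2} \operatorname{tr} R \right) , \tag{3}
\]

\[
p_1(Y; h, \sigma^2) = \left( \pi^K \det(h h^H + \sigma^2 I_K) \right)^{-N} \exp\left( -N \operatorname{tr}\left( R\, (h h^H + \sigma^2 I_K)^{-1} \right) \right) . \tag{4}
\]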

In the case where parameters $h$ and $\sigma^2$ are available, the celebrated Neyman-Pearson procedure
yields a uniformly most powerful test, given by the likelihood ratio statistics $p_1(Y; h, \sigma^2) / p_0(Y; \sigma^2)$.
However, in the case where $h$ and $\sigma^2$ are unknown, which is the problem addressed here, no
simple procedure guarantees a uniformly most powerful test, and a classical approach consists in
computing the GLR:

\[
L_N = \frac{\sup_{h, \sigma^2} p_1(Y; h, \sigma^2)}{\sup_{\sigma^2} p_0(Y; \sigma^2)} . \tag{5}
\]

In the GLRT procedure, one rejects hypothesis $H_0$ whenever $L_N > \xi_N$, where $\xi_N$ is a certain
threshold which is selected in order that the PFA $\mathbb{P}_0(L_N > \xi_N)$ does not exceed a given level
$\alpha$.

In the following proposition, which follows after straightforward computations from [26] and
[9], we derive the closed form expression of the GLR $L_N$. Denote by $\lambda_1 > \lambda_2 > \cdots > \lambda_K \ge 0$
the ordered eigenvalues of $R$ (all distinct with probability one).



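Proposition 1. Let $T_N$ be defined by:

\[
T_N = \frac{\lambda_1}{\frac{1}{K} \operatorname{tr} R} , \tag{6}
\]

then, the GLR (cf. Eq. (5)) writes:

\[
L_N = \frac{C}{T_N^{N} \left( 1 - \frac{T_N}{K} \right)^{(K-1)N}}
\]

where $C = \left( 1 - \frac{1}{K} \right)^{(1-K)N}$.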

By Proposition 1, $L_N = \phi_{N,K}(T_N)$ where $\phi_{N,K} : x \mapsto C x^{-N} \left( 1 - \frac{x}{K} \right)^{N(1-K)}$. The GLRT
rejects the null hypothesis when inequality $L_N > \xi_N$ holds. As $T_N \in (1, K)$ with probability one
and as $\phi_{N,K}$ is increasing on this interval, the latter inequality is equivalent to $T_N > \phi_{N,K}^{-1}(\xi_N)$.
Otherwise stated, the GLRT reduces to the test which rejects the null hypothesis for large values
of $T_N$:

\[
T_N \underset{H_0}{\overset{H_1}{\gtrless}} \gamma_N \tag{7}
\]

where $\gamma_N = \phi_{N,K}^{-1}(\xi_N)$ is a certain threshold which is such that the PFA does not exceed a given
level $\alpha$. In the sequel, we will therefore focus on the test statistics $T_N$.
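The statistic $T_N$ of (6) only requires the largest eigenvalue and the trace of the sampled covariance matrix. The sketch below computes it with the standard library alone; using power iteration for $\lambda_1$ is an implementation choice made here for self-containment (a numerical eigensolver would normally be preferred), and all function names are illustrative.

```python
def sample_covariance(columns):
    """R = (1/N) * sum_n y(n) y(n)^H for a list of N columns of length K."""
    n, k = len(columns), len(columns[0])
    return [[sum(y[i] * y[j].conjugate() for y in columns) / n
             for j in range(k)] for i in range(k)]

def largest_eigenvalue(r, iterations=500):
    """Power iteration for the top eigenvalue of a Hermitian PSD matrix."""
    k = len(r)
    v = [complex(1.0 + i) for i in range(k)]  # arbitrary nonzero start vector
    lam = 0.0
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(k)) for i in range(k)]
        lam = sum(abs(x) ** 2 for x in w) ** 0.5
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]  # renormalize so that lam -> lambda_1
    return lam

def glrt_statistic(columns):
    """T_N = lambda_1 / ((1/K) tr R), as in (6)."""
    r = sample_covariance(columns)
    k = len(r)
    trace = sum(r[i][i].real for i in range(k))
    return largest_eigenvalue(r) / (trace / k)

# Two extreme toy cases with K = 2: flat spectrum and rank-one spectrum.
t_flat = glrt_statistic([[1, 0], [0, 1]])
t_rank_one = glrt_statistic([[1, 1], [1, 1]])
```

The two toy inputs illustrate the range $(1, K)$ of the statistic: equal eigenvalues give the lower end, a rank-one sample covariance gives the upper end.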

Remark 1. There exist several variants of the above statistics [12], [13], [14], [15], which
merely consist in replacing the normalized trace with a more involved estimate of the noise
variance. Although very important from a practical point of view, these variants have no impact
on the (asymptotic) error exponent analysis. Therefore, we restrict our analysis to the traditional
GLRT for the sake of simplicity.

B. Exact threshold and p-values 

In order to complete the construction of the test, we must provide a procedure to set the
threshold $\gamma_N$. As usual, we propose to define $\gamma_N$ as the value which maximizes the power
$\mathbb{P}_1(T_N > \gamma_N)$ of the test (7) while keeping the PFA $\mathbb{P}_0(T_N > \gamma_N)$ under a desired level $\alpha \in (0, 1)$.
It is well-known (see for instance [8], [11]) that the latter threshold is obtained by:

\[
\gamma_N = p_N^{-1}(\alpha) \tag{8}
\]

where $p_N(t)$ represents the complementary c.d.f. of the statistics $T_N$ under the null hypothesis:

\[
p_N(t) = \mathbb{P}_0(T_N > t) . \tag{9}
\]

Note that $p_N(t)$ is continuous and decreasing from 1 to 0 on $t \in [0, \infty)$, so that the threshold
$p_N^{-1}(\alpha)$ in (8) is always well defined. When the threshold is fixed to $\gamma_N = p_N^{-1}(\alpha)$, the GLRT
rejects the null hypothesis when $T_N > p_N^{-1}(\alpha)$ or equivalently, when $p_N(T_N) < \alpha$. It is usually
convenient to rewrite the GLRT under the following form:

\[
p_N(T_N) \underset{H_1}{\overset{H_0}{\gtrless}} \alpha . \tag{10}
\]

The statistics $p_N(T_N)$ represents the significance probability or p-value of the test. The null
hypothesis is rejected when the p-value $p_N(T_N)$ is below the level $\alpha$. In practice, the computation
of the p-value associated with one experiment is of prime importance. Indeed, the p-value not
only allows one to accept or reject a hypothesis by (10), but it furthermore reflects how strongly the
data contradicts the null hypothesis [8].

In order to evaluate p-values, we derive in the sequel the exact expression of the complementary
c.d.f. $p_N$. The crucial point is that $T_N$ is a function of the eigenvalues $\lambda_1, \ldots, \lambda_K$ of the sampled
covariance matrix $R$. We have

\[
p_N(t) = \int_{\Delta_t} p^0_{K,N}(x_1, \ldots, x_K)\, dx_{1:K} \tag{11}
\]

where for each $t$, the domain of integration $\Delta_t$ is defined by:

\[
\Delta_t = \left\{ (x_1, \ldots, x_K) \in \mathbb{R}^K ,\ \frac{K x_1}{x_1 + \cdots + x_K} > t \right\}
\]

and $p^0_{K,N}$ is the joint probability density function (p.d.f.) of the ordered eigenvalues of $R$ under
$H_0$ given by:

\[
p^0_{K,N}(x_{1:K}) = \frac{1_{(x_1 \ge \cdots \ge x_K \ge 0)}}{Z^0_{K,N}} \prod_{1 \le i < j \le K} (x_j - x_i)^2 \prod_{j=1}^{K} x_j^{N-K} e^{-N x_j} \tag{12}
\]

where $1_{(x_1 \ge \cdots \ge x_K \ge 0)}$ stands for the indicator function of the set $\{(x_1, \ldots, x_K) : x_1 \ge \cdots \ge x_K \ge 0\}$
and where $Z^0_{K,N}$ is the normalization constant (see for instance [28], [29, Chapter 4]).

Remark 2. For each $t$, the computation of $p_N(t)$ requires the numerical evaluation of a
non-trivial integral. Despite the fact that powerful numerical methods, based on representations of
such integrals with hypergeometric functions [30], are available (see for instance [31], [32]),
an online computation, as required in a number of real-time applications, may be out of reach.
Instead, tables of the function $p_N$ should be computed offline i.e., prior to the experiment.



As both the dimensions $K$ and $N$ may be subject to frequent change⁴, all possible tables of
the function $p_N$ should be available at the detector's side, for all possible values of the couple
$(N, K)$. This requires both substantial computations and considerable memory space. In what
follows, we propose a way to overcome this issue.
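As a rough illustration of the offline tabulation discussed above, the following sketch replaces the exact integral (11) by a Monte Carlo estimate of the null distribution of $T_N$. This substitution is an assumption made here for illustration and is not the exact method of the paper; the function names, dimensions and number of trials are likewise hypothetical.

```python
import random

def glrt_statistic(columns, iterations=200):
    """T_N = lambda_1 / ((1/K) tr R), with lambda_1 from power iteration."""
    n, k = len(columns), len(columns[0])
    r = [[sum(y[i] * y[j].conjugate() for y in columns) / n
          for j in range(k)] for i in range(k)]
    v = [complex(1.0 + i) for i in range(k)]  # arbitrary nonzero start vector
    lam = 0.0
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(k)) for i in range(k)]
        lam = sum(abs(x) ** 2 for x in w) ** 0.5
        v = [x / lam for x in w]
    trace = sum(r[i][i].real for i in range(k))
    return lam / (trace / k)

def noise_only_columns(k, n, rng):
    """N snapshots of unit-variance circular complex Gaussian noise (H0)."""
    s = 0.5 ** 0.5
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(k)]
            for _ in range(n)]

def null_samples(k, n, trials, rng):
    """Sorted Monte Carlo draws of T_N under H0 (sigma^2 = 1 w.l.o.g.)."""
    return sorted(glrt_statistic(noise_only_columns(k, n, rng))
                  for _ in range(trials))

def empirical_threshold(samples, alpha):
    """A sample value whose empirical exceedance fraction is at most alpha."""
    idx = min(len(samples) - 1, int((1.0 - alpha) * len(samples)))
    return samples[idx]

def empirical_p_value(samples, t):
    """Fraction of null draws strictly above t, an estimate of p_N(t)."""
    return sum(1 for x in samples if x > t) / len(samples)

rng = random.Random(1)
samples = null_samples(k=3, n=12, trials=200, rng=rng)
gamma = empirical_threshold(samples, alpha=0.1)
```

One such table is needed per couple $(N, K)$, which is exactly the storage burden that motivates the asymptotic approach.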

In the sequel, we study the asymptotic behaviour of the complementary c.d.f. $p_N$ when both
the number of sensors $K$ and the number of snapshots $N$ go to infinity at the same rate. This
analysis leads to a simpler testing procedure.

C. Asymptotic framework 

We propose to analyze the asymptotic behaviour of the complementary c.d.f. $p_N$ as the number
of observations goes to infinity. More precisely, we consider the case where both the number $K$
of sensors and the number $N$ of snapshots go to infinity at the same speed, as assumed below:

\[
N \to \infty, \quad K \to \infty, \quad c_N := \frac{K}{N} \to c , \quad \text{with } 0 < c < 1 . \tag{13}
\]

⁴In cognitive radio applications for instance, the number of users $K$ which are connected to the network is frequently varying.

This asymptotic regime is relevant in cases where the sensing system must be able to perform
source detection in a moderate amount of time i.e., the number $K$ of sensors and the number $N$
of samples being of the same order. This is in particular the case in cognitive radio applications
(see for instance [33]). Very often, the number of sensors is lower than the number of snapshots,
hence the ratio $c$ is lower than 1.

In the sequel, we will simply write $N, K \to \infty$ to refer to the asymptotic regime (13).

Remark 3. The results related to the GLRT presented in Sections IV and V remain true for
$c \ge 1$; in the case of the test based on the condition number and presented in Section VI, extra
work is needed to handle the fact that the lowest eigenvalue converges to zero, which happens
if $c \ge 1$.

III. Large random matrices - Largest eigenvalue - Behaviour of the GLR statistics

In this section, we recall a few facts on large random matrices as the dimensions $N$, $K$ go to
infinity. We focus on the behaviour of the eigenvalues of $R$ which differs whether hypothesis
$H_0$ holds (Section III-A) or $H_1$ holds (Section III-B).

As the column vectors of $Y$ are i.i.d. complex Gaussian with covariance matrix $\Sigma$ given by
(2), the probability density of $R$ is given by:

\[
\frac{1}{Z(N, K, \Sigma)}\, e^{-N \operatorname{tr}(\Sigma^{-1} R)} (\det R)^{N-K}
\]

where $Z(N, K, \Sigma)$ is a normalizing constant.

A. Behaviour under hypothesis $H_0$

As the behaviour of $T_N$ does not depend on $\sigma^2$, we assume that $\sigma^2 = 1$; in particular, $\Sigma = I_K$.
Under $H_0$, matrix $R$ is a complex Wishart matrix and it is well-known (see for instance [28]) that
the Jacobian of the transformation between the entries of the matrix and the eigenvalues/angles
is given by the Vandermonde determinant $\prod_{1 \le i < j \le K} (\lambda_j - \lambda_i)^2$. This yields the joint p.d.f. of the
ordered eigenvalues (12) where the normalizing constant $Z(N, K, I_K)$ is denoted by $Z^0_{K,N}$ for
simplicity.

The celebrated result from Marcenko and Pastur [34] states that the limit as $N, K \to \infty$ of
the c.d.f. $F_N(x) = \frac{\#\{i,\ \lambda_i \le x\}}{K}$ associated to the empirical distribution of the eigenvalues $(\lambda_i)$ of
$R$ is equal to $\mathbb{P}_{MP}$ where $\mathbb{P}_{MP}$ represents the Marcenko-Pastur distribution:

\[
\mathbb{P}_{MP}(dy) = 1_{(\lambda^-, \lambda^+)}(y)\, \frac{\sqrt{(\lambda^+ - y)(y - \lambda^-)}}{2 \pi c\, y}\, dy , \tag{14}
\]

with $\lambda^+ = (1 + \sqrt{c})^2$ and $\lambda^- = (1 - \sqrt{c})^2$. This convergence is very fast in the sense that the
probability of deviating from $\mathbb{P}_{MP}$ decreases as $e^{-N^2 \times \mathrm{const.}}$. More precisely, a simple application
of the large deviations results in [35] yields that for any distance $d$ on the set of probability
measures on $\mathbb{R}$ compatible with the weak convergence and for any $\delta > 0$,

\[
\limsup_{N, K \to \infty} \frac{1}{N} \log \mathbb{P}_0\left( d(F_N, \mathbb{P}_{MP}) > \delta \right) = -\infty . \tag{15}
\]
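As a quick numerical sanity check of (14), the sketch below evaluates the Marcenko-Pastur density for unit noise variance and verifies by midpoint integration that it has total mass one and mean one when $0 < c < 1$. The helper names and the step count are illustrative choices.

```python
import math

def mp_edges(c):
    """Support edges (lambda^-, lambda^+) of the Marcenko-Pastur law."""
    return (1.0 - math.sqrt(c)) ** 2, (1.0 + math.sqrt(c)) ** 2

def mp_density(y, c):
    """Density of (14) at y, for 0 < c < 1 and unit noise variance."""
    lo, hi = mp_edges(c)
    if y <= lo or y >= hi:
        return 0.0
    return math.sqrt((hi - y) * (y - lo)) / (2.0 * math.pi * c * y)

def mp_integral(func, c, steps=20000):
    """Midpoint-rule approximation of the integral of func against (14)."""
    lo, hi = mp_edges(c)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * width
        total += func(y) * mp_density(y, c)
    return total * width

mass = mp_integral(lambda y: 1.0, c=0.5)  # close to 1
mean = mp_integral(lambda y: y, c=0.5)    # close to 1
```

The unit mean reflects the fact that the normalized trace $\frac{1}{K} \operatorname{tr} R$ concentrates around $\sigma^2 = 1$ under $H_0$.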

Moreover, the largest eigenvalue $\lambda_1$ of $R$ converges a.s. to the right edge of the Marcenko-
Pastur distribution, that is $(1 + \sqrt{c})^2$. A further result due to Johnstone [20] describes its speed
of convergence ($N^{-2/3}$) and its fluctuations (see also [36] for complementary results). Let $\Lambda_1$
be defined by:

\[
\Lambda_1 = N^{2/3} \left( \frac{\lambda_1 - (1 + \sqrt{c_N})^2}{b_N} \right) , \tag{16}
\]

where $b_N$ is defined by

\[
b_N := (1 + \sqrt{c_N}) \left( \frac{1}{\sqrt{c_N}} + 1 \right)^{1/3} , \tag{17}
\]

then $\Lambda_1$ converges in distribution toward a standard Tracy-Widom random variable with c.d.f.
$F_{TW}$ defined by:

\[
F_{TW}(x) = \exp\left( - \int_x^{\infty} (u - x)\, q^2(u)\, du \right) \quad \forall x \in \mathbb{R} , \tag{18}
\]

where $q$ solves the Painleve II differential equation:

\[
q''(x) = x\, q(x) + 2 q^3(x) , \qquad q(x) \sim \mathrm{Ai}(x) \ \text{as } x \to \infty
\]

and where $\mathrm{Ai}(x)$ denotes the Airy function. In particular, $F_{TW}$ is continuous. The Tracy-Widom
distribution was first introduced in [37], [38] as the asymptotic distribution of the centered and
rescaled largest eigenvalue of a matrix from the Gaussian Unitary Ensemble.

Tables of the Tracy-Widom law are available for instance in [39], while a practical algorithm
allowing one to efficiently evaluate equation (18) can be found in [40].
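The centering and scaling in (16) and (17) are straightforward to compute. The following sketch, with illustrative function names, returns the centering term $(1 + \sqrt{c_N})^2$, the scale $b_N$ and the rescaled largest eigenvalue; evaluating $F_{TW}$ itself is left to tables or to a dedicated routine and is not attempted here.

```python
import math

def tw_center_and_scale(n, k):
    """Return ((1 + sqrt(c_N))^2, b_N) with c_N = K/N, following (16)-(17)."""
    sqrt_c = math.sqrt(k / n)
    center = (1.0 + sqrt_c) ** 2
    b_n = (1.0 + sqrt_c) * (1.0 / sqrt_c + 1.0) ** (1.0 / 3.0)
    return center, b_n

def rescale_largest_eigenvalue(lam1, n, k):
    """N^(2/3) * (lambda_1 - (1 + sqrt(c_N))^2) / b_N, as in (16)."""
    center, b_n = tw_center_and_scale(n, k)
    return n ** (2.0 / 3.0) * (lam1 - center) / b_n

# Example with N = 100 snapshots and K = 25 sensors, i.e. c_N = 0.25.
center, b_n = tw_center_and_scale(n=100, k=25)
```

With these dimensions the centering term equals 2.25, the right edge of the Marcenko-Pastur support for $c = 0.25$.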



Remark 4. In the case where the entries of matrix $Y$ are real Gaussian random variables, the
fluctuations of the largest eigenvalue are still described by a Tracy-Widom distribution whose
definition slightly differs from the one given in the complex case (for details, see [20]).

B. Behaviour under hypothesis $H_1$

In this case, the covariance matrix writes $\Sigma = \sigma^2 I_K + h h^*$ and matrix $R$ follows a single
spiked model. Since the behaviour of $T_N$ is not affected if the entries of $Y$ are multiplied by a
given constant, we find it convenient to consider the model where $\Sigma = I_K + \frac{h h^*}{\sigma^2}$. Denote by

\[
\rho_K = \frac{\|h\|^2}{\sigma^2}
\]

the signal-to-noise ratio (SNR), then matrix $\Sigma$ admits the decomposition $\Sigma = U D U^*$ where $U$
is a unitary matrix and $D = \operatorname{diag}(1 + \rho_K, 1, \ldots, 1)$. With the same change of variables from the
entries of the matrix to the eigenvalues/angles with Jacobian $\prod_{1 \le i < j \le K} (\lambda_j - \lambda_i)^2$, the p.d.f. of
the ordered eigenvalues writes:

\[
p^1_{K,N}(x_{1:K}) = \frac{1_{(x_1 \ge \cdots \ge x_K \ge 0)}}{Z^1_{K,N}} \prod_{1 \le i < j \le K} (x_j - x_i)^2 \prod_{j=1}^{K} x_j^{N-K} e^{-N x_j}\, I_K\left( \frac{N}{K} B_K, X_K \right) \tag{19}
\]

where the normalizing constant $Z(N, K, I_K + h h^*)$ is denoted by $Z^1_{K,N}$ for simplicity, $X_K$ is
the diagonal matrix with eigenvalues $(x_1, \ldots, x_K)$, $B_K$ is the $K \times K$ diagonal matrix with
eigenvalues $\left( \frac{\rho_K}{1 + \rho_K}, 0, \ldots, 0 \right)$, and for any real diagonal matrices $C_K$, $D_K$, the spherical integral
$I_K(C_K, D_K)$ is defined as

\[
I_K(C_K, D_K) = \int e^{K \operatorname{tr}(C_K Q D_K Q^*)}\, dm_K(Q) , \tag{20}
\]

with $m_K$ the Haar measure on the unitary group of size $K$ (see [30, Chapter 3] for details).

Whereas this rank-one perturbation does not affect the asymptotic behaviour of $F_N$ (the
convergence toward $\mathbb{P}_{MP}$ and the deviations of the empirical measure given by (15) still hold
under $\mathbb{P}_1$), the limiting behaviour of the largest eigenvalue $\lambda_1$ can change if the signal-to-noise
ratio $\rho_K$ is large enough.

Assumption 1. The following constant $\rho \in \mathbb{R}$ exists:

\[
\rho = \lim_{K \to \infty} \frac{\|h\|^2}{\sigma^2} \left( = \lim_{K \to \infty} \rho_K \right) . \tag{21}
\]

We refer to $\rho$ as the limiting SNR. We also introduce

\[
\lambda^{\infty}_{spk} = (1 + \rho) \left( 1 + \frac{c}{\rho} \right) .
\]

Under hypothesis $H_1$, the largest eigenvalue has the following asymptotic behaviour as $N$, $K$ go
to infinity:

\[
\lambda_1 \xrightarrow[H_1]{a.s.} \begin{cases} \lambda^{\infty}_{spk} & \text{if } \rho > \sqrt{c} , \\ \lambda^+ & \text{otherwise,} \end{cases} \tag{22}
\]

see for instance [41] for a proof of this result. Note in particular that $\lambda^{\infty}_{spk}$ is strictly larger than
the right edge of the support $\lambda^+$ whenever $\rho > \sqrt{c}$. Otherwise stated, if the perturbation is large
enough, the largest eigenvalue converges outside the support of Marcenko-Pastur distribution.
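The phase transition in (22) can be written as a two-line function. The sketch below, with an illustrative function name, returns the almost sure limit of the largest eigenvalue for a given limiting SNR $\rho$ and ratio $c$, under the normalization $\sigma^2 = 1$ used in this section.

```python
import math

def limiting_largest_eigenvalue(rho, c):
    """Almost sure limit of lambda_1 under H1, following (22)."""
    if rho > math.sqrt(c):
        # The spike separates from the bulk.
        return (1.0 + rho) * (1.0 + c / rho)
    # Below the critical SNR the limit is the right bulk edge lambda^+.
    return (1.0 + math.sqrt(c)) ** 2

above = limiting_largest_eigenvalue(rho=1.0, c=0.25)  # rho > sqrt(c) = 0.5
below = limiting_largest_eigenvalue(rho=0.1, c=0.25)  # rho < sqrt(c)
```

At the critical value $\rho = \sqrt{c}$ both branches coincide with $(1 + \sqrt{c})^2$, so the limit is continuous in $\rho$; a source with limiting SNR below $\sqrt{c}$ leaves the limit of $\lambda_1$ unchanged.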

C. Limiting behaviour of $T_N$ under $H_0$ and $H_1$

Gathering the results recalled in Sections III-A and III-B, we obtain the following:

Proposition 2. Let Assumption 1 hold true and assume that $\rho > \sqrt{c}$, then:

\[
T_N \xrightarrow[H_0]{a.s.} (1 + \sqrt{c})^2 \quad \text{and} \quad T_N \xrightarrow[H_1]{a.s.} (1 + \rho) \left( 1 + \frac{c}{\rho} \right) \quad \text{as } N, K \to \infty .
\]

IV. Asymptotic threshold and p-values

A. Computation of the asymptotic threshold and p-value

In Theorem 1 below, we take advantage of the convergence results of the largest eigenvalue
of $R$ under $H_0$ in the asymptotic regime $N, K \to \infty$ to express the threshold and the p-value
of interest in terms of Tracy-Widom quantiles. Recall that $\bar{F}_{TW} = 1 - F_{TW}$, that $c_N = \frac{K}{N}$, and
that $b_N$ is given by (17).

Theorem 1. Consider a fixed level $\alpha \in (0, 1)$ and let $\gamma_N$ be the threshold for which the power
of test (7) is maximum, i.e. $p_N(\gamma_N) = \alpha$ where $p_N$ is defined by (11). Then:

1) The following convergence holds true:

\[
\zeta_N := \frac{N^{2/3}}{b_N} \left( \gamma_N - (1 + \sqrt{c_N})^2 \right) \xrightarrow[N, K \to \infty]{} \bar{F}_{TW}^{-1}(\alpha) .
\]

2) The PFA of the following test

\[
T_N \underset{H_0}{\overset{H_1}{\gtrless}} (1 + \sqrt{c_N})^2 + \frac{b_N}{N^{2/3}}\, \bar{F}_{TW}^{-1}(\alpha) \tag{23}
\]

converges to $\alpha$.

3) The p-value $p_N(T_N)$ associated with the GLRT can be approximated by:

\[
\tilde{p}_N(T_N) = \bar{F}_{TW}\left( \frac{N^{2/3} \left( T_N - (1 + \sqrt{c_N})^2 \right)}{b_N} \right) \tag{24}
\]

in the sense that $p_N(T_N) - \tilde{p}_N(T_N) \to 0$.

Remark 5. Theorem 1 provides a simple approach to compute both the threshold and the
p-values of the GLRT as the dimension $K$ of the observed time series and the number $N$ of
snapshots are large: The threshold $\gamma_N$ associated with the level $\alpha$ can be approximated by the
right-hand side of (23). Similarly, equation (24) provides a convenient approximation for the
p-value associated with one experiment. These approaches do not require the tedious computation
of the exact complementary c.d.f. (11) and, instead, only rely on tables of the c.d.f. $F_{TW}$, which
can be found for instance in [39] along with more details on the computational aspects (note
that function $F_{TW}$ does not depend on any of the problem's characteristics, and in particular
not on $c$). This is of importance in real-time applications, such as cognitive radio for instance,
where the users connected to the network must quickly decide for the presence/absence of a
source.
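The approximations (23) and (24) translate directly into code once Tracy-Widom values are available. In the sketch below the quantile $\bar{F}_{TW}^{-1}(\alpha)$ and the function $\bar{F}_{TW}$ are deliberately left as user-supplied inputs (for instance from a table lookup); the function names and the numerical quantile used in the example are placeholders, not values taken from the paper.

```python
import math

def _center_and_scale(n, k):
    """(1 + sqrt(c_N))^2 and b_N of (17), with c_N = K/N."""
    sqrt_c = math.sqrt(k / n)
    b_n = (1.0 + sqrt_c) * (1.0 / sqrt_c + 1.0) ** (1.0 / 3.0)
    return (1.0 + sqrt_c) ** 2, b_n

def asymptotic_threshold(n, k, tw_upper_quantile):
    """Right-hand side of (23); tw_upper_quantile stands for the
    complementary Tracy-Widom quantile at level alpha (user-supplied)."""
    center, b_n = _center_and_scale(n, k)
    return center + b_n * n ** (-2.0 / 3.0) * tw_upper_quantile

def asymptotic_p_value(t_n, n, k, tw_ccdf):
    """Approximation (24); tw_ccdf is a user-supplied callable standing
    for the complementary Tracy-Widom c.d.f."""
    center, b_n = _center_and_scale(n, k)
    return tw_ccdf(n ** (2.0 / 3.0) * (t_n - center) / b_n)

def glrt_decision(t_n, n, k, tw_upper_quantile):
    """True when the test (23) rejects the null hypothesis."""
    return t_n > asymptotic_threshold(n, k, tw_upper_quantile)

# Placeholder quantile for illustration only; replace by a table lookup.
PLACEHOLDER_QUANTILE = 0.0
threshold = asymptotic_threshold(n=100, k=25, tw_upper_quantile=PLACEHOLDER_QUANTILE)
```

Only $N$, $K$ and one Tracy-Widom value are needed per decision, in contrast with the per-$(N, K)$ tables required by the exact approach.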

Proof of Theorem 1: Before proving the three points of the theorem, we first describe the
fluctuations of $T_N$ under $H_0$ with the help of the results in Section III-A. Assume without loss
of generality that $\sigma^2 = 1$, recall that $T_N = \frac{\lambda_1}{\frac{1}{K} \operatorname{tr} R}$ and denote by:

\[
\tilde{T}_N = \frac{N^{2/3} \left( T_N - (1 + \sqrt{c_N})^2 \right)}{b_N} \tag{25}
\]

the rescaled and centered version of the statistics $T_N$. A direct application of Slutsky's lemma
(see for instance [42]) together with the fluctuations of $\lambda_1$ as recalled in Section III-A yields
that $\tilde{T}_N$ converges in distribution to a standard Tracy-Widom random variable with c.d.f. $F_{TW}$
which is continuous over $\mathbb{R}$ (the normalized trace $\frac{1}{K} \operatorname{tr} R$ fluctuates around 1 at scale $N^{-1}$,
which is negligible with respect to $N^{-2/3}$). Denote by $F_N$ the c.d.f. of $\tilde{T}_N$ under $H_0$, then a classical result,
sometimes called Polya's theorem (see for instance [43]), asserts that the convergence of $F_N$
towards $F_{TW}$ is uniform over $\mathbb{R}$:

\[
\sup_{x \in \mathbb{R}} |F_N(x) - F_{TW}(x)| \xrightarrow[N, K \to \infty]{} 0 . \tag{26}
\]

We are now in position to prove the theorem.

The mere definition of $\zeta_N$ implies that $\alpha = p_N(\gamma_N) = \bar{F}_N(\zeta_N)$. Due to (26), $\bar{F}_{TW}(\zeta_N) \to \alpha$.
As $\bar{F}_{TW}$ has a continuous inverse, the first point of the theorem is proved.

The second point is a direct consequence of the convergence of $F_N$ toward the Tracy-Widom
distribution: The PFA of test (23) can be written as $\mathbb{P}_0\left( \tilde{T}_N > \bar{F}_{TW}^{-1}(\alpha) \right)$, which readily converges
to $\alpha$.

The third point is a direct consequence of (26): $p_N(T_N) - \tilde{p}_N(T_N) = \bar{F}_N(\tilde{T}_N) - \bar{F}_{TW}(\tilde{T}_N) \to 0$.
This completes the proof of Theorem 1.



V. Asymptotic analysis of the power of the test

In this section, we provide an asymptotic analysis of the power of the GLRT as $N, K \to \infty$.
As the miss probability of the test goes exponentially to zero, its error exponent is computed with the help
of the large deviations associated to the largest eigenvalue of matrix $R$. The error exponent and
error exponent curve are computed in Theorem 2, Section V-A; the large deviations of interest
are stated in Section V-B. Finally, Theorem 2 is proved in Section V-C.



A. Error exponents and error exponent curve

The most natural approach to characterize the performance of a test is to evaluate its power or
equivalently its miss probability i.e., the probability under $H_1$ that the receiver decides hypothesis
$H_0$. For a given level $\alpha \in (0, 1)$, the miss probability writes:

\[
\beta_{N,T}(\alpha) = \inf_{\gamma} \left\{ \mathbb{P}_1(T_N < \gamma) ,\ \gamma \text{ such that } \mathbb{P}_0(T_N > \gamma) \le \alpha \right\} . \tag{27}
\]

Based on Section II-B, the infimum is achieved when the threshold coincides with $\gamma = p_N^{-1}(\alpha)$;
otherwise stated, $\beta_{N,T}(\alpha) = \mathbb{P}_1\left( T_N < p_N^{-1}(\alpha) \right)$ (notice that the miss probability depends on the
unknown parameters $h$ and $\sigma^2$). As $\beta_{N,T}(\alpha)$ has no simple expression in the general case, we
again study its asymptotic behaviour in the asymptotic regime of interest (13). It follows from
Theorem 1 that $p_N^{-1}(\alpha) \to \lambda^+ = (1 + \sqrt{c})^2$ for $\alpha \in (0, 1)$. On the other hand, under hypothesis
$H_1$, $T_N$ converges a.s. to $\lambda^{\infty}_{spk}$ which is strictly greater than $\lambda^+$ when the ratio $\frac{\|h\|^2}{\sigma^2}$ is large
enough. In this case, $\mathbb{P}_1\left( T_N < p_N^{-1}(\alpha) \right)$ goes to zero as it expresses the probability that $T_N$
deviates from its limit $\lambda^{\infty}_{spk}$; moreover, one can prove that the convergence to zero is exponential
in $N$:

\[
\mathbb{P}_1(T_N < x) \propto e^{-N I_{\rho}^{+}(x)} \quad \text{for } x < \lambda^{\infty}_{spk} , \tag{28}
\]

June 1, 2010 DRAFT 






where $I_{\rho}^{+}$ is the so-called rate function associated to $T_N$. This observation naturally yields the
following definition of the error exponent $\mathcal{E}_T$:

\[
\mathcal{E}_T = \lim_{N, K \to \infty} -\frac{1}{N} \log \beta_{N,T}(\alpha) \tag{29}
\]

the existence of which is established in Theorem 2 below (as $N, K \to \infty$). Also proved is the
fact that $\mathcal{E}_T$ does not depend on $\alpha$.

The error exponent $\mathcal{E}_T$ gives crucial information on the performance of the test $T_N$, provided
that the level $\alpha$ is kept fixed when $N$, $K$ go to infinity. Its existence strongly relies on the study
of the large deviations associated to the statistics $T_N$.

In practice however, one may as well take advantage of the increasing number of data not
only to decrease the miss probability, but to decrease the PFA as well. As a consequence, it is
of practical interest to analyze the detection performance when both the miss probability and
the PFA go to zero at exponential speed. A couple $(a, b) \in (0, \infty) \times (0, \infty)$ is said to be an
achievable pair of error exponents for the test $T_N$ if there exists a sequence of levels $\alpha_N$ such
that, in the asymptotic regime (13),

\[
\lim_{N, K \to \infty} -\frac{1}{N} \log \alpha_N = a \quad \text{and} \quad \lim_{N, K \to \infty} -\frac{1}{N} \log \beta_{N,T}(\alpha_N) = b . \tag{30}
\]

We denote by $\mathcal{S}_T$ the set of achievable pairs of error exponents for test $T_N$ as $N, K \to \infty$. We
refer to $\mathcal{S}_T$ as the error exponent curve of $T_N$.

The following notations are needed in order to describe the error exponent $\mathcal{E}_T$ and error
exponent curve $\mathcal{S}_T$.

\[
\begin{aligned}
f(x) &= \int \frac{\mathbb{P}_{MP}(dy)}{y - x} && \text{for } x \in \mathbb{R} \setminus (\lambda^-, \lambda^+) \\
F^+(x) &= \int \log(x - y)\, \mathbb{P}_{MP}(dy) && \text{for } x \ge \lambda^+
\end{aligned} \tag{31}
\]

Remark 6. Function $f$ is the well-known Stieltjes transform associated to Marcenko-Pastur
distribution and admits a closed-form representation formula. So does function $F^+$, although
this fact is perhaps less known. These results are gathered in the Appendix.

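Denote by $\Delta(\cdot \mid A)$ the convex indicator function, i.e. the function equal to zero for $x \in A$
and to infinity otherwise. For $\rho > \sqrt{c}$, define the function:

$I_\rho^+(x) = \frac{x - \lambda_{\mathrm{spk}}^\infty}{1 + \rho} - (1 - c) \log\left(\frac{x}{\lambda_{\mathrm{spk}}^\infty}\right) - c \left(\mathbf{F}^+(x) - \mathbf{F}^+(\lambda_{\mathrm{spk}}^\infty)\right) + \Delta(x \mid [\lambda^+, \infty))$ . (32)

Also define the function:

$I_0^+(x) = x - \lambda^+ - (1 - c) \log\left(\frac{x}{\lambda^+}\right) - 2c \left(\mathbf{F}^+(x) - \mathbf{F}^+(\lambda^+)\right) + \Delta(x \mid [\lambda^+, \infty))$ . (33)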

We are now in position to state the main theorem of the section: 

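Theorem 2. Let Assumption 1 hold true, then:

1) For any fixed level $\alpha \in (0, 1)$, the limit $\mathcal{E}_T$ in (29) exists as $N, K \to \infty$ and satisfies:

$\mathcal{E}_T = I_\rho^+(\lambda^+)$ (34)

if $\rho > \sqrt{c}$ and $\mathcal{E}_T = 0$ otherwise.

2) The error exponent curve of test $T_N$ is given by:

$\mathcal{S}_T = \left\{ (I_0^+(x), I_\rho^+(x)) : x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty) \right\}$ (35)

if $\rho > \sqrt{c}$ and $\mathcal{S}_T = \emptyset$ otherwise.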
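The quantities appearing in Theorem 2 are easy to evaluate numerically. The sketch below is illustrative only: it assumes unit noise variance and the standard form of the Marchenko-Pastur density, and it evaluates $\mathbf{F}^+$ by a midpoint rule after the substitution $y = 1 + c + 2\sqrt{c}\cos\theta$ (a choice made here, not the closed form of the appendix). All function names are hypothetical.

```python
import numpy as np

def mp_edges(c):
    """Edges (lambda^-, lambda^+) of the Marchenko-Pastur support, unit variance."""
    s = np.sqrt(c)
    return (1 - s) ** 2, (1 + s) ** 2

def lambda_spk(rho, c):
    """Limit of T_N under H1 when rho > sqrt(c)."""
    return (1 + rho) * (1 + c / rho)

def F_plus(x, c, n=4000):
    """F^+(x) = int log(x - y) P_MP(dy) for x >= lambda^+ (midpoint rule).

    With y = 1 + c + 2 sqrt(c) cos(theta), the MP density times dy becomes
    the smooth weight 2 sin(theta)^2 / (pi y) d(theta) on (0, pi).
    """
    theta = (np.arange(n) + 0.5) * np.pi / n
    y = 1 + c + 2 * np.sqrt(c) * np.cos(theta)
    weight = 2 * np.sin(theta) ** 2 / (np.pi * y)
    return float(np.sum(weight * np.log(x - y)) * np.pi / n)

def I0_plus(x, c):
    """Rate function (33); infinite to the left of lambda^+."""
    lam_plus = mp_edges(c)[1]
    if x < lam_plus:
        return np.inf
    return (x - lam_plus - (1 - c) * np.log(x / lam_plus)
            - 2 * c * (F_plus(x, c) - F_plus(lam_plus, c)))

def Irho_plus(x, rho, c):
    """Rate function (32); meaningful for rho > sqrt(c)."""
    lam_plus = mp_edges(c)[1]
    spk = lambda_spk(rho, c)
    if x < lam_plus:
        return np.inf
    return ((x - spk) / (1 + rho) - (1 - c) * np.log(x / spk)
            - c * (F_plus(x, c) - F_plus(spk, c)))

def error_exponent(rho, c):
    """E_T of Theorem 2-(1): I_rho^+(lambda^+) if rho > sqrt(c), else 0."""
    lam_plus = mp_edges(c)[1]
    return Irho_plus(lam_plus, rho, c) if rho > np.sqrt(c) else 0.0

if __name__ == "__main__":
    c, rho = 0.5, 10.0  # illustrative values
    print("lambda^+ =", mp_edges(c)[1], " lambda_spk =", lambda_spk(rho, c))
    print("E_T =", error_exponent(rho, c))
```

A point of the error exponent curve (35) is then obtained as `(I0_plus(x, c), Irho_plus(x, rho, c))` for any `x` strictly between `mp_edges(c)[1]` and `lambda_spk(rho, c)`.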

The proof of Theorem 2 heavily relies on the large deviations of $T_N$ and is postponed to
Section IV-C. Before providing the proof, it is worth making the following remarks.

Remark 7. Several variants of the GLRT have been proposed in the literature, and typically
consist in replacing the denominator $\frac{1}{K}\mathrm{tr}\,\hat{\mathbf{R}}$ (which converges toward $\sigma^2$) by a more involved
estimate of $\sigma^2$ in order to decrease the bias [12], [1?], [1?], [15]. However, it can be established
that the error exponents of the above variants are as well given by (34) and (35) in the asymptotic
regime.

Remark 8. The error exponent $\mathcal{E}_T$ yields a simple approximation of the miss probability in the
sense that $\beta_{N,T}(\alpha) \simeq e^{-N \mathcal{E}_T}$ as $N \to \infty$. It depends on the limiting ratio $c$ and on the value
of the SNR $\rho$ through the constant $\lambda_{\mathrm{spk}}^\infty$. In the high SNR case, the error exponent turns out to
have a simple expression as a function of $\rho$. If $\rho \to \infty$ then $\lambda_{\mathrm{spk}}^\infty$ tends to infinity as well, which
simplifies the expression of rate function $I_\rho^+$. Using $\mathbf{F}^+(\lambda_{\mathrm{spk}}^\infty) = \log \lambda_{\mathrm{spk}}^\infty + o_\rho(1)$ where $o_\rho(1)$
stands for a term which converges to zero as $\rho \to \infty$, it is straightforward to show that for each
$x \ge \lambda^+$, $I_\rho^+(x) = \log \rho - 1 - (1 - c) \log x - c\,\mathbf{F}^+(x) + o_\rho(1)$. After some algebra, we finally
obtain:

$\mathcal{E}_T = \log \rho - (1 + \sqrt{c}) - (1 - c) \log(1 + \sqrt{c}) - c \log \sqrt{c} + o_\rho(1)$ .

At high SNR, this yields the following convenient approximation of the miss probability:

$\beta_{N,T}(\alpha) \simeq \left(\frac{\psi(c)}{\rho}\right)^N$ , (36)

where $\psi(c) = e^{(1 + \sqrt{c})} (1 + \sqrt{c})^{1 - c}\, c^{c/2}$.
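As a quick illustration of the high-SNR approximation (36), the sketch below evaluates $\psi(c)$ and the resulting approximate miss probability. It assumes the reading $\psi(c) = e^{1+\sqrt{c}}(1+\sqrt{c})^{1-c}c^{c/2}$, which is the form obtained by exponentiating the high-SNR expression of $\mathcal{E}_T$ above; the function names are hypothetical and the approximation is only meaningful for large $\rho$.

```python
import math

def psi(c):
    """Constant psi(c) of (36), assuming psi(c) = e^(1+sqrt(c)) (1+sqrt(c))^(1-c) c^(c/2)."""
    s = math.sqrt(c)
    return math.exp(1 + s) * (1 + s) ** (1 - c) * c ** (c / 2)

def high_snr_exponent(rho, c):
    """High-SNR expression of E_T without its o(1) term."""
    s = math.sqrt(c)
    return math.log(rho) - (1 + s) - (1 - c) * math.log(1 + s) - c * math.log(s)

def approx_miss_probability(rho, c, N):
    """Approximation (36): (psi(c) / rho)^N."""
    return (psi(c) / rho) ** N

if __name__ == "__main__":
    c, rho = 0.5, 100.0  # illustrative values (rho = 20 dB)
    for N in (10, 20, 50):
        print(N, approx_miss_probability(rho, c, N))
```

By construction, `-log(psi(c) / rho)` coincides with `high_snr_exponent(rho, c)`, so (36) is simply $e^{-N\mathcal{E}_T}$ with the $o_\rho(1)$ term dropped.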

B. Large Deviations associated to $T_N$

In order to express the error exponents of interest, a rigorous formalization of (28) is needed.
Let us recall the definition of a Large Deviation Principle: A sequence of random variables
$(X_N)_{N \in \mathbb{N}}$ satisfies a Large Deviation Principle (LDP) under $\mathbb{P}$ in the scale $N$ with good rate
function $I$ if the following properties hold true:

• $I$ is a nonnegative function with compact level sets, i.e. $\{x,\ I(x) \le t\}$ is compact for $t \in \mathbb{R}$,

• for any closed set $F \subset \mathbb{R}$, the following upper bound holds true:

$\limsup_{N\to\infty} \frac{1}{N} \log \mathbb{P}(X_N \in F) \le -\inf_F I$ . (37)

• for any open set $G \subset \mathbb{R}$, the following lower bound holds true:

$\liminf_{N\to\infty} \frac{1}{N} \log \mathbb{P}(X_N \in G) \ge -\inf_G I$ . (38)

For instance, if $A$ is a set such that $\inf_{\mathrm{int}(A)} I = \inf_{\mathrm{cl}(A)} I\ (= \inf_A I)$, (where $\mathrm{int}(A)$ and $\mathrm{cl}(A)$
respectively denote the interior and the closure of $A$), then (37) and (38) yield

$\lim_{N\to\infty} N^{-1} \log \mathbb{P}(X_N \in A) = -\inf_A I$ . (39)

Informally stated,

$\mathbb{P}(X_N \in A) \propto e^{-N \inf_A I}$ as $N \to \infty$ .

If, moreover $\inf_A I > 0$ (which typically happens if the limit of $X_N$ -if existing- does not belong
to $A$), then probability $\mathbb{P}(X_N \in A)$ goes to zero exponentially fast, hence a large deviation (LD);
and the event $\{X_N \in A\}$ can be referred to as a rare event. We refer the reader to [44] for
further details on the subject.
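To make the definition concrete, here is a toy example that is not taken from the paper: the empirical mean of $N$ i.i.d. standard Gaussian variables satisfies an LDP in the scale $N$ with rate function $I(x) = x^2/2$ (a classical fact), so that $-N^{-1}\log\mathbb{P}(X_N \ge a)$ should approach $a^2/2$ for $a > 0$. The sketch below checks this using the exact Gaussian tail; the function names are hypothetical.

```python
import math

def gaussian_mean_tail_rate(a, N):
    """-(1/N) log P(X_N >= a), where X_N is the mean of N i.i.d. N(0,1) variables.

    X_N is N(0, 1/N), so P(X_N >= a) = 0.5 * erfc(a * sqrt(N) / sqrt(2)).
    """
    p = 0.5 * math.erfc(a * math.sqrt(N) / math.sqrt(2))
    return -math.log(p) / N

def rate_function(x):
    """Rate function of the Gaussian empirical mean."""
    return x * x / 2

if __name__ == "__main__":
    a = 1.0
    for N in (10, 50, 200, 400):
        print(N, gaussian_mean_tail_rate(a, N), "limit:", rate_function(a))
```

The set $A = [a, \infty)$ satisfies the condition preceding (39), and the infimum of $I$ over $A$ is attained at its left endpoint $a$, which is why the limit is $a^2/2$.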

As already mentioned above, all the probabilities of interest are rare events as $N, K$ go to
infinity related to large deviations for $T_N$. More precisely, Theorem 2 is merely a consequence
of the following Lemma.

Lemma 1. Let Assumption 1 hold true and let $N, K \to \infty$, then:






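1) Under $H_0$, $T_N$ satisfies the LDP in the scale $N$ with good rate function $I_0^+$, which is
increasing from 0 to $\infty$ on interval $[\lambda^+, \infty)$.

2) Under $H_1$ and if $\rho > \sqrt{c}$, $T_N$ satisfies the LDP in the scale $N$ with good rate function
$I_\rho^+$. Function $I_\rho^+$ is decreasing from $I_\rho^+(\lambda^+)$ to 0 on $[\lambda^+, \lambda_{\mathrm{spk}}^\infty]$ and increasing from 0 to
$\infty$ on $[\lambda_{\mathrm{spk}}^\infty, \infty)$.

3) For any bounded sequence $(\eta_N)_{N \ge 0}$,

$\lim_{N,K\to\infty} -\frac{1}{N} \log \mathbb{P}_1\left(T_N < (1 + \sqrt{c_N})^2 + \frac{\eta_N}{N^{2/3}}\right) = \begin{cases} I_\rho^+(\lambda^+) & \text{if } \rho > \sqrt{c} \\ 0 & \text{otherwise.} \end{cases}$ (40)

4) Let $x \in (\lambda^+, \infty)$ and let $(x_N)_{N \ge 0}$ be any real sequence which converges to $x$. If $\rho \le \sqrt{c}$,
then:

$\lim_{N,K\to\infty} -\frac{1}{N} \log \mathbb{P}_1(T_N < x_N) = 0$ . (41)

The proof of Lemma 1 is provided in Appendix A.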

Remark 9. 1) The proof of the large deviations for $T_N$ relies on the fact that the denominator
$K^{-1}\mathrm{tr}\,\hat{\mathbf{R}}$ of $T_N$ concentrates much faster than $\lambda_1$. Therefore, the large deviations of $T_N$
are driven by those of $\lambda_1$, a fact that is exploited in the proof.

2) In Appendix A, we rather focus on the large deviations of $\lambda_1$ under $H_1$ and skip the proof
of Lemma 1-(1), which is simpler and available (to some extent) in [29, Theorem 2.6.6]*.
Indeed, the proof of the LDP relies on the joint density of the eigenvalues. Under $H_1$, this
joint density has an extra-term, the spherical integral, and is thus harder to analyze.

3) Lemma 1-(3) is not a mere consequence of Lemma 1-(2) as it describes the deviations of
$T_N$ at the vicinity of a point of discontinuity of the rate function. The direct application
of the LDP would provide a trivial lower bound ($-\infty$) in this case.

4) In the case where the entries of matrix $\mathbf{Y}$ are real Gaussian random variables, the results
stated in Lemma 1 will still hold true with minor modifications: The rate functions will be
slightly different. Indeed, the computation of the rate functions relies on the joint density
of the eigenvalues, which differs whether the entries of $\mathbf{Y}$ are real or complex.

*See also the errata sheet for the sign error in the rate function on the authors' webpage.

[Figure 1: two panels, "Plot of rate function $I_0^+$" and "Plot of rate function $I_\rho^+$".]

Figure 1. Plots of rate functions $I_0^+$ and $I_\rho^+$ in the case where $c = 0.5$ and $\rho = 1$ dB. In this case, $\lambda^+ = 2.9142$, $\lambda_{\mathrm{spk}}^\infty = 3$,
$I_0^+(\lambda^+) = 0$ and $I_\rho^+(\lambda_{\mathrm{spk}}^\infty) = 0$.



C. Proof of Theorem 2

In order to prove (34), we must study the asymptotic behaviour of the miss probability
$\beta_{N,T}(\alpha) = \mathbb{P}_1(T_N < p_N^{-1}(\alpha))$ as $N, K \to \infty$. Using Theorem 1-(1), we recall that

$\beta_{N,T}(\alpha) = \mathbb{P}_1\left(T_N < (1 + \sqrt{c_N})^2 + \frac{\eta_N}{N^{2/3}}\right)$ (42)

where $c_N = K/N$ converges to $c$ and where $\eta_N$ is a deterministic sequence such that

$\lim_{N,K\to\infty} \eta_N = (1 + \sqrt{c}) \left(\frac{1}{\sqrt{c}} + 1\right)^{1/3} \bar{F}_{TW}^{-1}(\alpha)$ .

Hence, Lemma 1-(3) yields the first point of Theorem 2. We now prove the second point. Assume
that $\rho > \sqrt{c}$. Consider any $x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$ and for every $N, K$, consider the test function which
rejects the null hypothesis when $T_N > x$,

$T_N \underset{H_0}{\overset{H_1}{\gtrless}} x$ . (43)

Denote by $\alpha_N = \mathbb{P}_0(T_N > x)$ the PFA associated with this test. By Lemma 1-(1) together with
the continuity of the rate function at $x$, we obtain:

$\lim_{N,K\to\infty} -\frac{1}{N} \log \alpha_N = \inf_{y \ge x} I_0^+(y) = I_0^+(x)$ . (44)

The miss probability of this test is given by $\beta_{N,T}(\alpha_N) = \mathbb{P}_1(T_N < x)$. By Lemma 1-(2),

$\lim_{N,K\to\infty} -\frac{1}{N} \log \beta_{N,T}(\alpha_N) = \inf_{y \le x} I_\rho^+(y) = I_\rho^+(x)$ . (45)

Equations (44) and (45) prove that $(I_0^+(x), I_\rho^+(x))$ is an achievable pair of error exponents.
Therefore, the set in the righthand side of (35) is included in $\mathcal{S}_T$. We now prove the converse.
Assume that $(a, b)$ is an achievable pair of error exponents and let $\alpha_N$ be a sequence such
that (30) holds. Denote by $\gamma_N = p_N^{-1}(\alpha_N)$ the threshold associated with level $\alpha_N$. As $I_0^+(x)$ is
continuous and increasing from 0 to $\infty$ on interval $(\lambda^+, \infty)$, there exists a (unique) $x \in (\lambda^+, \infty)$
such that $a = I_0^+(x)$. We now prove that $\gamma_N$ converges to $x$ as $N$ tends to infinity. Consider a
subsequence $\gamma_{\varphi(N)}$ which converges to a limit $\gamma \in \mathbb{R} \cup \{\infty\}$. Assume that $\gamma > x$. Then there
exists $\epsilon > 0$ such that $\gamma_{\varphi(N)} > x + \epsilon$ for large $N$. This yields:

$-\frac{1}{\varphi(N)} \log \mathbb{P}_0\left(T_{\varphi(N)} > \gamma_{\varphi(N)}\right) \ge -\frac{1}{\varphi(N)} \log \mathbb{P}_0\left(T_{\varphi(N)} > x + \epsilon\right)$ . (46)

Taking the limit in both terms yields $I_0^+(x) \ge I_0^+(x + \epsilon)$ by Lemma 1, which contradicts the
fact that $I_0^+$ is an increasing function. Now assume that $\gamma < x$. Similarly,

$-\frac{1}{\varphi(N)} \log \mathbb{P}_0\left(T_{\varphi(N)} > \gamma_{\varphi(N)}\right) \le -\frac{1}{\varphi(N)} \log \mathbb{P}_0\left(T_{\varphi(N)} > x - \epsilon\right)$ (47)

for a certain $\epsilon$ and for $N$ large enough. Taking the limit of both terms, we obtain $I_0^+(x) \le
I_0^+(x - \epsilon)$ which leads to the same contradiction. This proves that $\lim_N \gamma_N = x$. Recall that by
definition (30),

$b = \lim_{N,K\to\infty} -\frac{1}{N} \log \mathbb{P}_1(T_N < \gamma_N)$ .

As $\gamma_N$ tends to $x$, Lemma 1 implies that the righthand side of the above equation is equal to
$I_\rho^+(x) > 0$ if $x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$ and $\rho > \sqrt{c}$. It is equal to 0 if $x \ge \lambda_{\mathrm{spk}}^\infty$ or $\rho \le \sqrt{c}$. Now $b > 0$ by
definition, therefore both conditions $x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$ and $\rho > \sqrt{c}$ hold. As a conclusion, if $(a, b)$ is
an achievable pair of error exponents, then $(a, b) = (I_0^+(x), I_\rho^+(x))$ for a certain $x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$,
and furthermore $\rho > \sqrt{c}$. This completes the proof of the second point of Theorem 2.

VI. Comparison with the test based on the condition number

This section is devoted to the study of the asymptotic performances of the test $U_N = \lambda_1/\lambda_K$,
which is popular in cognitive radio [22], [23], [24]. The main result of the section is Theorem
3, where it is proved that the test based on $T_N$ asymptotically outperforms the one based on $U_N$
in terms of error exponent curves.

A. Description of the test

A different approach which has been introduced in several papers devoted to cognitive radio
contexts consists in rejecting the null hypothesis for large values of the statistics $U_N$ defined by:

$U_N = \frac{\lambda_1}{\lambda_K}$ , (48)

which is the ratio between the largest and the smallest eigenvalues of $\hat{\mathbf{R}}$. Random variable $U_N$
is the so-called condition number of the sampled covariance matrix $\hat{\mathbf{R}}$. As for $T_N$, an important
feature of the statistics $U_N$ is that its law does not depend on the unknown parameter $\sigma^2$ which
is the level of the noise. Under hypothesis $H_0$, recall that the spectral measure of $\hat{\mathbf{R}}$ weakly
converges to the Marcenko-Pastur distribution (14) with support $(\lambda^-, \lambda^+)$. In addition to the fact
that $\lambda_1$ converges toward $\lambda^+$ under $H_0$ and $\lambda_{\mathrm{spk}}^\infty$ under $H_1$, the following result related to the
convergence of the lowest eigenvalue is of importance (see for instance [45], [46], BTI):

$\lambda_K \xrightarrow{a.s.} \sigma^2 (1 - \sqrt{c})^2$ (49)

under both hypotheses $H_0$ and $H_1$. Therefore, the statistics $U_N$ admits the following limits:

$U_N \xrightarrow[H_0]{a.s.} \frac{\lambda^+}{\lambda^-} = \frac{(1 + \sqrt{c})^2}{(1 - \sqrt{c})^2}$ , and $U_N \xrightarrow[H_1]{a.s.} \frac{\lambda_{\mathrm{spk}}^\infty}{\lambda^-}$ for $\rho > \sqrt{c}$ . (50)

The test is based on the observation that the limit of $U_N$ under the alternative $H_1$ is strictly
larger than the ratio $\lambda^+/\lambda^-$, at least when the SNR $\rho$ is large enough.
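The limits of $T_N$ and $U_N$ can be observed on a single simulated draw. The sketch below is illustrative only and rests on assumptions not spelled out in this section: unit noise variance, circular complex Gaussian entries, and the rank-one model $y(n) = h\,s(n) + w(n)$ with $\|h\|^2 = \rho$. The dimensions, the seed and all function names are arbitrary choices made here.

```python
import numpy as np

def sample_eigenvalues(K, N, rho, rng):
    """Ascending eigenvalues of R = Y Y^H / N for one draw of the K x N matrix Y.

    Assumed model: y(n) = h s(n) + w(n), unit noise variance, ||h||^2 = rho.
    Setting rho = 0 gives hypothesis H0.
    """
    noise = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    signal = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    h = np.zeros(K, dtype=complex)
    h[0] = np.sqrt(rho)  # the direction of h is irrelevant: the noise is isotropic
    Y = np.outer(h, signal) + noise
    return np.linalg.eigvalsh(Y @ Y.conj().T / N)

def glrt_and_condition_number(eigs):
    """Return (T_N, U_N) computed from ascending eigenvalues."""
    return eigs[-1] / eigs.mean(), eigs[-1] / eigs[0]

rng = np.random.default_rng(0)
K, N, rho = 100, 200, 4.0            # c = 0.5 and rho > sqrt(c)
c = K / N
lam_minus, lam_plus = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
lam_spk = (1 + rho) * (1 + c / rho)

T0, U0 = glrt_and_condition_number(sample_eigenvalues(K, N, 0.0, rng))
T1, U1 = glrt_and_condition_number(sample_eigenvalues(K, N, rho, rng))
print("H0: T_N = %.3f (limit %.3f), U_N = %.2f (limit %.2f)" % (T0, lam_plus, U0, lam_plus / lam_minus))
print("H1: T_N = %.3f (limit %.3f), U_N = %.2f (limit %.2f)" % (T1, lam_spk, U1, lam_spk / lam_minus))
```

At such moderate dimensions the agreement is only approximate, in particular for $U_N$, whose denominator $\lambda_K$ is small.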

B. A few remarks related to the determination of the threshold for the test $U_N$

The determination of the threshold for the test $U_N$ relies on the asymptotic independence of
$\lambda_1$ and $\lambda_K$ under $H_0$. As we shall prove below that test $U_N$ is asymptotically outperformed
by test $T_N$, such a study, rather involved, seems beyond the scope of this article. For the sake
of completeness however, we describe informally how to set the threshold for $U_N$. Recall the
definition of $\Lambda_1$ in (16) and let $\Lambda_K$ be defined as:

$\Lambda_K = N^{2/3}\, \frac{\lambda_K - (1 - \sqrt{c_N})^2}{(\sqrt{c_N} - 1)\left(c_N^{-1/2} - 1\right)^{1/3}}$ .

Then both $\Lambda_1$ and $\Lambda_K$ converge toward Tracy-Widom random variables. Moreover,

$(\Lambda_1, \Lambda_K) \xrightarrow{\mathcal{D}} (X, Y)$ ,

where $X$ and $Y$ are independent random variables, both distributed according to $F_{TW}$.*

As a corollary of the previous convergence, a direct application of the Delta method ETl
Chapter 3] yields the following convergence in distribution:

2- 



where 

■^^^^)f' and 



which enables one to set the threshold of the test, based on the quantiles of the random variable
$aX + bY$. In particular, following the same arguments as in Theorem 1-(1), one can prove that
the optimal threshold (for some fixed $\alpha \in (0, 1)$), defined by $\mathbb{P}_0(U_N > \gamma_N) = \alpha$, satisfies



In particular, is bounded as $N, K \to \infty$.
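The Delta method step can be sketched as follows. This is a reconstruction under stated assumptions, not a copy of the original displays: it takes $\sigma^2 = 1$, assumes that $\lambda_1$ is centered and scaled as $T_N$ is in the threshold expansion used for the GLRT, takes the asymptotic independence of $\Lambda_1$ and $\Lambda_K$ for granted, and uses a first-order expansion of $(u, v) \mapsto u/v$ at $(\lambda^+, \lambda^-)$. The name $\xi_N$ is hypothetical.

```latex
\begin{align*}
\lambda_1 &\approx (1+\sqrt{c_N})^2 + N^{-2/3}\,(1+\sqrt{c_N})\big(c_N^{-1/2}+1\big)^{1/3}\,\Lambda_1 , \\
\lambda_K &\approx (1-\sqrt{c_N})^2 + N^{-2/3}\,(\sqrt{c_N}-1)\big(c_N^{-1/2}-1\big)^{1/3}\,\Lambda_K , \\
N^{2/3}\Big(U_N - \frac{(1+\sqrt{c_N})^2}{(1-\sqrt{c_N})^2}\Big)
  &\xrightarrow{\ \mathcal{D}\ } aX + bY , \\
a = \frac{(1+\sqrt{c})\,(c^{-1/2}+1)^{1/3}}{(1-\sqrt{c})^2} , &\qquad
b = \frac{(1+\sqrt{c})^2\,(c^{-1/2}-1)^{1/3}}{(1-\sqrt{c})^3} , \\
\gamma_N &= \frac{(1+\sqrt{c_N})^2}{(1-\sqrt{c_N})^2} + \frac{\xi_N}{N^{2/3}} ,
  \qquad \xi_N \text{ bounded.}
\end{align*}
```

Here $a = s_1/\lambda^-$ and $b = -\lambda^+ s_K/(\lambda^-)^2$, where $s_1$ and $s_K$ denote the scaling factors of $\lambda_1$ and $\lambda_K$ in the first two lines; $b$ is positive because $s_K$ is negative for $c < 1$.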



C. Performance analysis and comparison with the GLRT

We now provide the performance analysis of the above test based on the condition number
$U_N$ in terms of error exponents. In accordance with the definitions of section IV-A, we define the
miss probability associated with test $U_N$ as $\beta_{N,U}(\alpha) = \inf_\gamma \mathbb{P}_1(U_N < \gamma)$ for any level $\alpha \in (0, 1)$,
where the infimum is taken w.r.t. all thresholds $\gamma$ such that $\mathbb{P}_0(U_N > \gamma) \le \alpha$. We denote by $\mathcal{E}_U$
the limit of sequence $-\frac{1}{N} \log \beta_{N,U}(\alpha)$ (if it exists) in the asymptotic regime (13). We denote by $\mathcal{S}_U$
the error exponent curve associated with test $U_N$, i.e., the set of couples $(a, b)$ of positive numbers
for which $-\frac{1}{N} \log \beta_{N,U}(\alpha_N) \to b$ for a certain sequence $\alpha_N$ which satisfies $-\frac{1}{N} \log \alpha_N \to a$.

Theorem 3 below provides the error exponents associated with test $U_N$. As for $T_N$, the
performance of the test is expressed in terms of the rate function of the LDPs for $U_N$ under $\mathbb{P}_0$
or $\mathbb{P}_1$. These rate functions combine the rate functions for the largest eigenvalue $\lambda_1$, i.e. $I_\rho^+$ and
$I_0^+$ defined in Section IV-B, together with the rate function associated to the smallest eigenvalue,
$I^-$ defined below. As we shall see, the positive rank-one perturbation does not affect $\lambda_K$ whose
rate function remains the same under $H_0$ and $H_1$.

*Such an asymptotic independence is not formally proved yet for $\hat{\mathbf{R}}$ under $H_0$, but is likely to be true as a similar result has
been established in the case of the Gaussian Unitary Ensemble [47], [40].

We first define:

$\mathbf{F}^-(x) = \int \log(y - x)\, d\mathbb{P}_{\mathrm{MP}}(y)$ for $x \le \lambda^-$ . (51)

As for $\mathbf{F}^+$, function $\mathbf{F}^-$ also admits a closed-form expression based on $\mathbf{f}$, the Stieltjes transform
of Marcenko-Pastur distribution (see Appendix C for details).
Now, define for each $x \in \mathbb{R}$:

$I^-(x) = x - \lambda^- - (1 - c) \log\left(\frac{x}{\lambda^-}\right) - 2c \left(\mathbf{F}^-(x) - \mathbf{F}^-(\lambda^-)\right) + \Delta(x \mid (0, \lambda^-])$ . (52)

If $\lambda_1$ and $\lambda_K$ were independent random variables, the contraction principle (see e.g. [44]) would
imply that the following functions

$\Gamma_\rho(t) = \inf\left\{ I_\rho^+(x) + I^-(y) : \frac{x}{y} = t \right\}$ and $\Gamma_0(t) = \inf\left\{ I_0^+(x) + I^-(y) : \frac{x}{y} = t \right\}$

defined for each $t \ge 0$, are the rate functions associated with the LDP governing $\lambda_1/\lambda_K$ under
hypotheses $H_1$ and $H_0$ respectively. Of course, $\lambda_1$ and $\lambda_K$ are not independent, and the contraction
principle does not apply. However, a careful study of the p.d.f. $p^0_{K,N}$ and $p^1_{K,N}$ shows that $\lambda_1$
and $\lambda_K$ behave as if they were asymptotically independent, from a large deviation perspective:

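Lemma 2. Let Assumption 1 hold true and let $N, K \to \infty$, then:

1) Under $H_0$, $U_N$ satisfies the LDP in the scale $N$ with good rate function $\Gamma_0$.

2) Under $H_1$ and if $\rho > \sqrt{c}$, $U_N$ satisfies the LDP in the scale $N$ with good rate function
$\Gamma_\rho$.

3) For any bounded sequence $(\eta_N)_{N \ge 0}$,

$\lim_{N,K\to\infty} -\frac{1}{N} \log \mathbb{P}_1\left(U_N < \frac{(1 + \sqrt{c_N})^2}{(1 - \sqrt{c_N})^2} + \frac{\eta_N}{N^{2/3}}\right) = \begin{cases} \Gamma_\rho(\lambda^+/\lambda^-) & \text{if } \rho > \sqrt{c} \\ 0 & \text{otherwise.} \end{cases}$ (53)

Moreover, $\Gamma_\rho(\lambda^+/\lambda^-) = I_\rho^+(\lambda^+)$.
4) Let $x \in (\lambda^+/\lambda^-, \infty)$ and let $(x_N)_{N \ge 0}$ be any real sequence which converges to $x$. If $\rho \le \sqrt{c}$,
then:

$\lim_{N,K\to\infty} -\frac{1}{N} \log \mathbb{P}_1(U_N < x_N) = 0$ . (54)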

N,K->oo N 

Remark 10. In the context of Lemma 2, both quantities $\lambda_1$ and $\lambda_K$ deviate at the same speed,
to the contrary of statistics $T_N$ where the denominator concentrated much faster than the largest
eigenvalue $\lambda_1$. Nevertheless, the proof of Lemma 2 is a slight extension of the proof of Lemma 1
based on the study of the joint deviations $(\lambda_1, \lambda_K)$, the proof of which can be performed similarly
to the proof of the deviations of $\lambda_1$. Once the large deviations are established for the couple $(\lambda_1, \lambda_K)$,
it is a matter of routine to get the large deviations for the ratio $\lambda_1/\lambda_K$. A proof is outlined in
Appendix B.

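We now provide the main result of the section.

Theorem 3. Let Assumption 1 hold true, then:

1) For any fixed level $\alpha \in (0, 1)$ and for each $\rho$, the error exponent $\mathcal{E}_U$ exists and coincides
with $\mathcal{E}_T$.

2) The error exponent curve of test $U_N$ is given by:

$\mathcal{S}_U = \left\{ (\Gamma_0(t), \Gamma_\rho(t)) : t \in \left(\frac{\lambda^+}{\lambda^-}, \frac{\lambda_{\mathrm{spk}}^\infty}{\lambda^-}\right) \right\}$ (55)

if $\rho > \sqrt{c}$ and $\mathcal{S}_U = \emptyset$ otherwise.

3) The error exponent curve $\mathcal{S}_T$ of test $T_N$ uniformly dominates $\mathcal{S}_U$ in the sense that for each
$(a, b) \in \mathcal{S}_U$ there exists $b' > b$ such that $(a, b') \in \mathcal{S}_T$.

Proof: The proof of items (1) and (2) is merely bookkeeping from the proof of Theorem 2
with Lemma 2 at hand.

Let us prove item (3). The key observation lies in the following two facts:

$\forall x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$, $\Gamma_\rho\left(\frac{x}{\lambda^-}\right) = I_\rho^+(x)$ , (56)

$\forall x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$, $\Gamma_0\left(\frac{x}{\lambda^-}\right) < I_0^+(x)$ . (57)

Recall that

$\Gamma_\rho\left(\frac{x}{\lambda^-}\right) = \inf\left\{ I_\rho^+(u) + I^-(v) : \frac{u}{v} = \frac{x}{\lambda^-} \right\} \overset{(a)}{\le} I_\rho^+(x) + I^-(\lambda^-) = I_\rho^+(x)$ ,

where (a) follows from the fact that $I^-(\lambda^-) = 0$ and by taking $u = x$, $v = \lambda^-$. Assume that
inequality (a) is strict. Due to the fact that $I_\rho^+$ is decreasing, the only way to decrease the value
of $I_\rho^+(u) + I^-(v)$ under the considered constraint $\frac{u}{v} = \frac{x}{\lambda^-}$ is to find a couple $(u, v)$ with $u > x$,
but this cannot happen because this would enforce $v > \lambda^-$ so that the constraint $\frac{u}{v} = \frac{x}{\lambda^-}$ remains
fulfilled, and this would end up with $I^-(v) = \infty$. Necessarily, (a) is an equality and (56) holds
true.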

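Let us now give a sketch of proof for (57). Notice first that $\frac{\partial I_0^+}{\partial u}\big|_{u = x} > 0$ (which easily follows
from the fact that $I_0^+$ is increasing and differentiable) while $\frac{\partial I^-}{\partial v}\big|_{v \nearrow \lambda^-} = 0$. This equality follows
from the direct computation:

$\lim_{x \nearrow \lambda^-} \frac{I^-(x)}{x - \lambda^-} = \lim_{x \nearrow \lambda^-} \left(1 - \frac{1 - c}{x} - 2c\, \frac{d\mathbf{F}^-}{dx}(x)\right) = 1 - \frac{1 + \sqrt{c}}{1 - \sqrt{c}} + 2c\, \mathbf{f}(\lambda^-) = 0$ ,

where the last equality follows from the fact that $\frac{d\mathbf{F}^-}{dx} = -\mathbf{f}$ together with the closed-form
expression for $\mathbf{f}$ as given in Appendix C. As previously, write:

$\Gamma_0\left(\frac{x}{\lambda^-}\right) \le I_0^+(x) + I^-(\lambda^-) = I_0^+(x)$ .

Consider now a small perturbation $u = x - \delta$ and the related perturbation $v = \lambda^- - \delta'$ so
that the constraint $\frac{u}{v} = \frac{x}{\lambda^-}$ remains fulfilled. Due to the values of the derivatives of $I_0^+$ and $I^-$
at respective points $x$ and $\lambda^-$, the decrease of $I_0^+(x - \delta)$ will be larger than the increase of
$I^-(\lambda^- - \delta')$, and this will result in the fact that

$\Gamma_0\left(\frac{x}{\lambda^-}\right) \le I_0^+(x - \delta) + I^-(\lambda^- - \delta') < I_0^+(x)$ ,

which is the desired result, which in turn yields (57).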

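We can now prove Theorem 3-(3). Let $(a, b) \in \mathcal{S}_U$ and $(a, b') \in \mathcal{S}_T$, we shall prove that
$b < b'$. Due to the mere definitions of the curves $\mathcal{S}_U$ and $\mathcal{S}_T$, there exist $x \in (\lambda^+, \lambda_{\mathrm{spk}}^\infty)$ and
$t \in (\lambda^+/\lambda^-, \lambda_{\mathrm{spk}}^\infty/\lambda^-)$ such that $a = I_0^+(x) = \Gamma_0(t)$. Eq. (57) yields that $\frac{x}{\lambda^-} < t$. As $I_\rho^+$ is
decreasing, we have

$b' = I_\rho^+(x) > I_\rho^+(t \lambda^-) = \Gamma_\rho(t) = b$ ,

and the proof is completed. ■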
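Facts (56) and (57) can be checked numerically on an example. The sketch below is illustrative only: it assumes unit noise variance, the rate functions $I_0^+$, $I_\rho^+$ and $I^-$ in the forms (33), (32) and (52), a midpoint rule for the integrals against the Marchenko-Pastur law, and a plain grid search for the infimum defining $\Gamma_0$ and $\Gamma_\rho$. The values of $c$, $\rho$ and $x$ and all names are arbitrary choices made here.

```python
import math

C, RHO = 0.5, 10.0  # illustrative values, not taken from the paper
S = math.sqrt(C)
LAM_MINUS, LAM_PLUS = (1 - S) ** 2, (1 + S) ** 2
LAM_SPK = (1 + RHO) * (1 + C / RHO)

def mp_integral(g, n=1000):
    """Midpoint rule for int g(y) P_MP(dy), using y = 1 + c + 2 sqrt(c) cos(theta)."""
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * math.pi / n
        y = 1 + C + 2 * S * math.cos(theta)
        total += 2 * math.sin(theta) ** 2 / (math.pi * y) * g(y)
    return total * math.pi / n

def F_plus(x):
    return mp_integral(lambda y: math.log(x - y))

def F_minus(x):
    return mp_integral(lambda y: math.log(y - x))

F_PLUS_EDGE, F_PLUS_SPK, F_MINUS_EDGE = F_plus(LAM_PLUS), F_plus(LAM_SPK), F_minus(LAM_MINUS)

def I0_plus(u):
    if u < LAM_PLUS:
        return math.inf
    return u - LAM_PLUS - (1 - C) * math.log(u / LAM_PLUS) - 2 * C * (F_plus(u) - F_PLUS_EDGE)

def Irho_plus(u):
    if u < LAM_PLUS:
        return math.inf
    return ((u - LAM_SPK) / (1 + RHO) - (1 - C) * math.log(u / LAM_SPK)
            - C * (F_plus(u) - F_PLUS_SPK))

def I_minus(v):
    if not 0 < v <= LAM_MINUS:
        return math.inf
    return v - LAM_MINUS - (1 - C) * math.log(v / LAM_MINUS) - 2 * C * (F_minus(v) - F_MINUS_EDGE)

def gamma(rate_plus, t, m=100):
    """Grid approximation of inf{rate_plus(u) + I^-(v) : u / v = t}."""
    v_low = LAM_PLUS / t  # below this value, u = t v leaves [lambda^+, infinity)
    best = math.inf
    for j in range(m + 1):
        v = LAM_MINUS - (LAM_MINUS - v_low) * j / m  # j = 0 gives v = lambda^- exactly
        best = min(best, rate_plus(t * v) + I_minus(v))
    return best

x = 5.0              # any point of (lambda^+, lambda_spk)
t = x / LAM_MINUS
gamma_rho, gamma_0 = gamma(Irho_plus, t), gamma(I0_plus, t)
print("Gamma_rho(x/lambda^-) =", gamma_rho, " I_rho^+(x) =", Irho_plus(x))
print("Gamma_0(x/lambda^-)   =", gamma_0, " I_0^+(x)   =", I0_plus(x))
```

On this example the grid minimum for $\Gamma_\rho$ is attained at $v = \lambda^-$, in line with (56), while for $\Gamma_0$ it is attained at some $v < \lambda^-$, in line with (57).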

Remark 11. Theorem 3-(1) indicates that when the number of data increases, the powers of
tests $T_N$ and $U_N$ both converge to one at the same exponential speed $\mathcal{E}_U = \mathcal{E}_T$, provided that
the level $\alpha$ is kept fixed. However, when the level goes to zero exponentially fast as a function of
the number of snapshots, then the test based on $T_N$ outperforms $U_N$ in terms of error exponents:
The power of $T_N$ converges to one faster than the power of $U_N$. Simulation results for $N, K$
fixed sustain this claim (cf. Figure 4). This proves that in the context of interest ($N, K \to \infty$),
the GLRT approach should be preferred to the test $U_N$.

[Figure 2: "Log of the Error exponent for different values of c".]

Figure 2. Computation of the logarithm of the error exponent $\mathcal{E}$ associated to the test $T_N$ for different values of $c$ (with $\mathcal{E}_\rho$
defined for $\rho > \sqrt{c}$ and $\mathcal{E}_\rho|_{\rho \le \sqrt{c}} = 0$), and comparison with the optimal result (Neyman-Pearson) obtained in the case where
all the parameters are perfectly known.



In the following section, we analyze the performance of the proposed tests in various scenarios.

Figure 2 compares the error exponent of test $T_N$ with the optimal NP test (assuming that all
the parameters are known) for various values of $c$ and $\rho$. The error exponent of the NP test can
be easily obtained using Stein's Lemma (see for instance [48]).

In Figure 3 we compare the Error Exponent curves of both tests $T_N$ and $U_N$. The analytic
expressions provided in Theorems 2 and 3 for the Error Exponent curves have been used to plot the curves.
The asymptotic comparison clearly underlines the gain of using test $T_N$.

VII. Numerical Results

[Figure 3: "Error Exponent Curves for T1 and T2".]

Figure 3. Error Exponent curves associated to the tests $T_N$ (T1) and $U_N$ (T2) in the case where c — ^ and $\rho = 10$ dB. Each
point of the curve corresponds to a given error exponent under $H_0$ (X axis) and its counterpart error exponent under $H_1$ (Y
axis) as described in Theorem 2-(2) for $T_N$ and Theorem 3-(2) for $U_N$.

Finally, we compare in Figure 4 the powers (computed by Monte-Carlo methods) of tests
$T_N$ and $U_N$ for finite values of $N$ and $K$. We consider the case where $K = 10$, $N = 50$ and
$\rho = 1$ and plot the probability of error under $H_0$ versus the power of the test, that is $\alpha$ versus
$\mathbb{P}_1(T_N > \gamma_N)$ (resp. $\mathbb{P}_1(U_N > \gamma_N)$) where $\gamma_N$ is fixed by the following condition:

$\mathbb{P}_0(T_N > \gamma_N) = \alpha$ (resp. $\mathbb{P}_0(U_N > \gamma_N) = \alpha$) .
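A Monte-Carlo ROC of this kind can be produced along the following lines. This is a minimal sketch under assumptions made here, not the simulation code behind Figure 4: unit noise variance, circular complex Gaussian entries, the rank-one model $y(n) = h\,s(n) + w(n)$ with $\|h\|^2 = \rho$, and thresholds $\gamma_N$ taken as empirical quantiles of the statistic under $H_0$. The number of trials, the seed and all names are arbitrary.

```python
import numpy as np

def simulate_statistics(K, N, rho, trials, rng):
    """Draw `trials` independent samples of (T_N, U_N); rho = 0 corresponds to H0."""
    T = np.empty(trials)
    U = np.empty(trials)
    h = np.zeros(K, dtype=complex)
    h[0] = np.sqrt(rho)
    for t in range(trials):
        noise = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
        signal = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
        Y = np.outer(h, signal) + noise
        eigs = np.linalg.eigvalsh(Y @ Y.conj().T / N)  # ascending order
        T[t] = eigs[-1] / eigs.mean()
        U[t] = eigs[-1] / eigs[0]
    return T, U

def empirical_power(stat_h0, stat_h1, alphas):
    """Power P1(stat > gamma), with gamma the empirical (1 - alpha) quantile under H0."""
    gammas = np.quantile(stat_h0, 1 - alphas)
    return np.array([(stat_h1 > g).mean() for g in gammas])

rng = np.random.default_rng(1)
K, N, rho, trials = 10, 50, 1.0, 2000
alphas = np.array([0.01, 0.05, 0.1, 0.2])

T0, U0 = simulate_statistics(K, N, 0.0, trials, rng)
T1, U1 = simulate_statistics(K, N, rho, trials, rng)
power_T = empirical_power(T0, T1, alphas)
power_U = empirical_power(U0, U1, alphas)
for a, pt, pu in zip(alphas, power_T, power_U):
    print("alpha = %.2f: power of T_N = %.3f, power of U_N = %.3f" % (a, pt, pu))
```

Since the thresholds are estimated from a finite number of $H_0$ draws, the smallest levels are the least reliable; more trials are needed to resolve small values of $\alpha$.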
VIII. Conclusion 

In this contribution, we have analyzed in detail the GLRT in the case where the noise variance 
and the channel are unknown. Unlike similar contributions, we have focused our efforts on the 
analysis of the error exponent by means of large random matrix theory and large deviation 
techniques. Closed-form expressions were obtained and enabled us to establish that the GLRT 
asymptotically outperforms the test based on the condition number, a fact that is supported by 
finite-dimension simulations. We also believe that the large deviations techniques introduced here 
will be of interest for the engineering community, beyond the problem addressed in this paper.

Figure 4. Simulated ROC curves for $T_N$ (test 1) and $U_N$ (test 2) in the case where $K = 10$, $N = 50$ and $\rho = 10$ dB.

Appendix A
Proof of Lemma 1: Large deviations for $T_N$

The large deviations of the largest eigenvalue of large random matrices have already been
investigated in various contexts, Gaussian Orthogonal Ensemble [49] and deformed Gaussian
ensembles [21]. As mentioned in [21, Remark 1.2], the proofs of the latter can be extended to
complex Wishart matrix models, that is random matrices $\hat{\mathbf{R}}$ under $H_0$ or $H_1$.

In both cases, the large deviations of $\lambda_1$ rely on a close study of the density of the eigenvalues,
either given by (12) (under $H_0$) or by (19) for the spiked model (under $H_1$). The study of the
spiked model, as it involves the study of the asymptotics of the spherical integral (see Lemma 3
below), is more difficult. We therefore focus on the proof of the LDP under $H_1$ (Lemma 1-(2))
and omit the proof of Lemma 1-(1). Once Lemma 1-(2) is proved, proving Lemma 1-(1) is a
matter of bookkeeping, with the spherical integral removed at each step.

Recall that $\lambda_1 \ge \cdots \ge \lambda_K$ are the ordered eigenvalues of $\hat{\mathbf{R}}$ and that $T_N$ is the statistics
defined in Q.

In the sequel, we shall prove the upper bound of the LDP in Lemma 1-(2) (which gives also
the upper bound in Lemma 1-(3)). The proof of the lower bound in Lemma 1-(3) requires more
precise arguments than the lower bound of the LDP. One has indeed to study what happens at
the vicinity of $\lambda^+$, which is a point of discontinuity of the rate function $I_\rho^+$. Thus, we skip the
proof of the lower bound of the LDP in Lemma 1-(2) to avoid repetition. Note that the proof
of Lemma 1-(4) is a mere consequence of the fact that $T_N$ converges a.s. to $\lambda^+$ if $\rho \le \sqrt{c}$, thus
$\mathbb{P}_1(T_N < x_N)$ converges to 1 whenever $x_N$ converges to $x > \lambda^+$.

For sake of simplicity and with no loss of generality as the law of $T_N$ does not depend on $\sigma^2$,
we assume all along this appendix that $\sigma^2 = 1$. We first recall important asymptotic results for
spherical integrals.



A. Useful facts about spherical integrals

Recall that the joint distributions of the ordered eigenvalues under hypothesis $H_0$ and $H_1$
are respectively given by (12) and (19). In the latter, the so-called spherical integral (20) is
introduced. We recall here results from [21] related to the asymptotic behaviour of the spherical
integral in the case where one diagonal matrix is of rank one and the other has the limiting
distribution $\mathbb{P}_{\mathrm{MP}}$. We first introduce the function $J_\rho$ defined for $x \ge \lambda^+$ by:

$J_\rho(x) = \begin{cases} \frac{\rho}{c} - \log\left(\frac{\rho}{c(1+\rho)}\right) - \mathbf{F}^+(\lambda_{\mathrm{spk}}^\infty) , & \text{if } \rho \le \sqrt{c} \text{ and } \lambda^+ \le x \le \lambda_{\mathrm{spk}}^\infty , \\ \frac{\rho x}{c(1+\rho)} - 1 - \log\left(\frac{\rho}{c(1+\rho)}\right) - \mathbf{F}^+(x) , & \text{otherwise.} \end{cases}$ (58)

Consider a $K$-tuple $(x_1, \cdots, x_K)$ and denote by $\pi_{K,x} = \frac{1}{K-1} \sum_{i=2}^{K} \delta_{x_i}$ the empirical
distribution associated to $(x_2, \cdots, x_K)$; let $d$ be a metric compatible with the topology of weak
convergence of measures (for example the Dudley distance - see for instance [50]). A strong
version of the convergence of the spherical integral in the exponential scale with speed $N$,
established in [21] can be summarized in the following Lemma:

Lemma 3. Assume that $N, K \to \infty$ and $\frac{K}{N} \to c \in (0, 1)$ and let Assumption 1 hold true. Let
$x_1 \ge x_2 \ge \cdots \ge x_K \ge 0$ and $\delta > 0$. If, for $N$ large enough, $|x_1 - x| \le \delta$ and $d(\pi_{K,x}, \mathbb{P}_{\mathrm{MP}}) \le N^{-1/4}$,
then:

$\left| \frac{1}{N} \log I_K\left(\frac{N}{K} \mathbf{B}_K, \mathbf{X}_K\right) - c J_\rho(x) \right| \le \delta$ ,

where $J_\rho$ is given by (58), $\mathbf{B}_K = \mathrm{diag}\left(\frac{\rho}{1+\rho}, 0, \ldots, 0\right)$ and $\mathbf{X}_K = \mathrm{diag}(x_1, \cdots, x_K)$.

Recall that the spherical integral $I_K$, defined in (20), appears in the joint density (19) of
the eigenvalues under $H_1$. Lemma 3 provides a simple asymptotic equivalent $c J_\rho(x)$ of the
normalized integral $\frac{1}{N} \log I_K$. Roughly speaking, this will enable us to replace $I_K$ by the
quantity $e^{N c J_\rho(x)}$ when establishing the large deviations of $\lambda_1$, which rely on a careful study
of density (19).

B. Proof of Lemma 1-(2)

In order to establish the LDP under hypothesis $H_1$ and condition $\rho > \sqrt{c}$ (that is the bounds
(37) and (38)), we first notice that intervals $(x, x + \delta)$ for $x, \delta \in \mathbb{R}_+$ form a basis of the topology
of $\mathbb{R}_+$. The LDP will be therefore a consequence of the following bounds:

• (Exponential tightness) there exists a function $f : \mathbb{R}_+ \to \mathbb{R}_+$ going to infinity at infinity
such that for all $N$,

$\mathbb{P}_1(\lambda_1 \ge M) \le e^{-N f(M)}$ . (59)

Condition (59) is technical (see for instance [44, Lemma 1.2.18]): Instead of proving
the large deviation upper bound for every closed set, the exponential tightness (59), if
established, enables one to restrict to the compact sets.

• (Upper bound) For any $x$, for any $M$ such that $0 < x < M$,

$\lim_{\delta \downarrow 0} \limsup_{N,K\to\infty} \frac{1}{N} \log \mathbb{P}_1(x \le T_N \le x + \delta,\ \lambda_1 \le M) \le -I_\rho^+(x)$ , (60)

Due to the exponential tightness, it is sufficient to establish the upper bound for compact
sets. As each compact can be covered by a finite number of balls, it is therefore sufficient
to establish upper estimate (60) in order to establish the LD upper bound.

• (Lower bound) For any $x$,

$\lim_{\delta \downarrow 0} \liminf_{N,K\to\infty} \frac{1}{N} \log \mathbb{P}_1(x \le T_N \le x + \delta) \ge -I_\rho^+(x)$ . (61)

The fact that (61) implies the LD lower bound (38) is standard in LD and can be found in
[44, Chapter 1] for instance.

As the arguments are very similar to the ones developed in [21], we only prove in detail the
upper bound (60). Proofs of (59) and (61) are left to the reader.

The idea is that the empirical measure $\pi_{K,\lambda} := \frac{1}{K-1} \sum_{i=2}^{K} \delta_{\lambda_i}$ (of all but the largest eigenvalues)
and the trace concentrate faster than the largest eigenvalue. In the exponential scale with speed
$N$, $\pi_{K,\lambda}$ and the trace can be considered as equal to their limit, respectively $\mathbb{P}_{\mathrm{MP}}$ and 1. In
particular, the deviations of $T_N$ arise from those of the largest eigenvalue and they both satisfy
the same LDP with the same rate function $I_\rho^+$. We therefore isolate the terms depending on $\lambda_1$
and gather the others through their empirical measure $\pi_{K,\lambda}$.

Recall the notations introduced in (12) and (19) and let $x \ge \lambda^+$, $\delta > 0$. Consider the following
domain:

$\mathcal{D} = \left\{ (x_1, \cdots, x_K) \in [0, M]^K,\ \frac{K x_1}{x_1 + \cdots + x_K} \in (x, x + \delta) \right\}$ .

For N large enough: 

Fi{x <Tn <x + 5, Ai < M) = / dp],^j^{xi..K) 

Jd 

= f ^^^^-Nxi^{N-K)\ogxi^2{K-l)J\og{xi-u)dnK.^{u) 

K 



(. _ 1 n(A'-1){JV-1) 7 
l,-"- N) ^K-l,N-l 



l<i<j j=2 
1 ^^(A'-l){JV-l) ^0 



71 



where we performed the change of variables $y_i := \frac{N}{N-1} x_i$ for $i = 2 : K$, and the related
modifications $\pi_{K,x} \leftrightarrow \pi_{K,y}$ and $\mathbf{X}_K = \mathrm{diag}\left(x_1, \frac{N-1}{N} y_2, \cdots, \frac{N-1}{N} y_K\right)$. Note also that strictly
speaking, the domain of integration $\mathcal{D}$ would express differently with the $y_i$'s and in particular,
we should have changed constant $M$ which majorizes the $x_i$'s into a larger constant as the $y_i$'s
can theoretically be slightly above $M$ - we keep the same notation for the sake of simplicity.
To proceed, one has to study the asymptotic behaviour of the normalizing constant:

$\left(1 - \frac{1}{N}\right)^{(K-1)(N-1)} \frac{Z^0_{K-1,N-1}}{Z^1_{K,N}}$

which turns out to be difficult. Instead of establishing directly the bounds (59)-(61), we proceed
as in [21] and establish similar bounds replacing the probability measures $\mathbb{P}_1$ by the measures
$\mathbb{Q}_1$ defined as:

$\mathbb{Q}_1 := \frac{Z^1_{K,N}}{Z^0_{K-1,N-1} \left(1 - \frac{1}{N}\right)^{(K-1)(N-1)}}\, \mathbb{P}_1$

and the rate function $I_\rho^+$ by the function $G_\rho$ defined by:

$G_\rho(x) = \frac{x}{1 + \rho} - (1 - c) \log x - c\, \mathbf{F}^+(x) + c + c \log\left(\frac{\rho}{c(1 + \rho)}\right)$

for $x \ge \lambda^+$. Notice that these positive measures $\mathbb{Q}_1$ are not probability measures any more, and
as a consequence, the function $G_\rho$ is not necessarily positive and its infimum might not be equal
to zero, as it is the case for a rate function.
Writing the upper bound for $\mathbb{Q}_1$, we obtain:



\i{x <Tn <x + 5, Xi< M) 



K 

where, for any compactly supported probability measure $\mu$ and any real number $y$ greater than
the right edge of the support of $\mu$,

$\Phi(y, c_N, \mu) = -y + (1 - c_N) \log y + 2 c_N \int \log(y - \lambda)\, d\mu(\lambda)$ .

Let us now localise the empirical measure $\pi_{K,y}$ around $\mathbb{P}_{\mathrm{MP}}$* and the trace around 1. The
continuity and convergence properties of the spherical integral recalled in Lemma 3 yield, for
$K$ large enough:

Qiix <Tj^ <x + 5 , \i< M) < [ ^ dxi I e-^*(^^''='^'*^.y)e^'=(''''(^i)+'^)dp°^„i N-iiVi-K) 



^^K^N^K^NM^ f dpl^ {y^,^l (62) 
Je.c 

with 

£:=|(y2,---,yi^)G[0,M]^^'-\ d(7r^,y,PMp)<^ and 1 ^ y ■ g [l - 5^, 1 + 5^] | . 

The second term in (62) is easily obtained considering the fact that all the eigenvalues are less
than $M$ so that for $1 \le j \le K$, $|x_1 - x_j| \le 2M$, $x_1^{N-K} \le M^{N-K}$ and $(\mathbf{U} \mathbf{X}_K \mathbf{U}^*)_{11} \le M$. Now,



standard concentration results under $H_0$ yield that:



limsup logPo ( 7r,^,A ^ ^(Pmp, N^'^') or ^ V A, ^ [l - 5^ 1 + 6^]] 

N,K^oo I\ y J 



-OO. 



More precisely, one knows using llSTIl that the empirical measure Yl!j=2 close enough to 
its expectation and then using [[52ll one knows that the expectation is close enough to its limit 
Pj;;jp. The arguments are detailed in the Wigner case in [|2T|| and we do not give more details here. 



^Notice that if ttk.x. is close to Pj^p. so is 7rK,y due to the change of variable yi — j^^Xi 
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As Cat -> c for N,K ^ oo, c ^{y,c,fj,) is continuous and /i ^{y,c,fi) is lower 
semi-continuous, we obtain: 



limsup — logQi(a; < Ai < X + (5 , Ai < M) < sup {^{u,c, P^p) + cJp{u)) + 25. 



1 

N,K^cx} ^ u&[x,x+5] 

By continuity in u of the two involved functions, we finally get: 



limlimsup — logQi(x < Ai < X + (5 , Ai < M) < ^{x,c,Fy^p) + cJp{x) = Gp{x) , 

and the counterpart of Eq. ( |60l ) is proved for Qi and function Gp. The proof of the lower bound 
is quite similar and left to the reader. It remains now to recover (l60l) . As Pi is a probability 
measure and the whole space IR+ is both open and closed, an application of the upper and lower 
bounds for Qi immediately yields: 

1 

liminf — log ^'^ ,M Tn e M+) 

^K-1,N-1 n) 



r 1 1 ^K,N 

lim — log ■ 



'^K-1,N-1 \^ n) 

= -infG„. (63) 

R+ 

This implies that the LDP holds for $P_1$ with rate function $G_\rho - \inf_{\mathbb{R}^+} G_\rho$.

It remains to check that $I_\rho^+ = G_\rho - \inf_{\mathbb{R}^+} G_\rho$, which easily follows from the fact, to be proved, that:

$$\inf_{x\in[\lambda^+,\infty)} G_\rho(x) = G_\rho(\lambda^\infty_{spk}). \qquad (64)$$

We therefore study the variations of $G_\rho$ over $[\lambda^+,\infty)$. Note that $(F^+)' = -f$, and thus that $G'_\rho(x) = (1+\rho)^{-1} - (1-c)x^{-1} + c f(x)$. Function $f$, being a Stieltjes transform, is increasing for $x > \lambda^+$, and so is $G'_\rho$, whose limit at infinity is $(1+\rho)^{-1}$. Straightforward but involved computations using the explicit representation (67) for $f$ yield that $G'_\rho(\lambda^\infty_{spk}) = 0$. Therefore, $G_\rho$ is decreasing on $[\lambda^+, \lambda^\infty_{spk}]$ and increasing on $[\lambda^\infty_{spk}, \infty)$, and (64) is proved.
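The stationarity claim $G'_\rho(\lambda^\infty_{spk}) = 0$ can be checked numerically. The following is a minimal sketch, not the authors' computation: it assumes the usual spiked-model value $\lambda^\infty_{spk} = (1+\rho)(1+c/\rho)$ (defined in the main text, not in this appendix), uses the closed form (67) for $f$, and picks the illustrative values $c = 0.5$, $\rho = 1$ (so that $\rho > \sqrt{c}$). The function names are hypothetical.

```python
import math

def mp_stieltjes(x, c):
    """Closed form (67) for f(x), valid for x > lambda^+ = (1 + sqrt(c))^2."""
    lam_plus = (1.0 + math.sqrt(c)) ** 2
    if x <= lam_plus:
        raise ValueError("formula (67) requires x > lambda^+")
    disc = (1.0 - x - c) ** 2 - 4.0 * c * x
    return ((1.0 - x - c) + math.sqrt(disc)) / (2.0 * c * x)

def G_prime(x, c, rho):
    """G'_rho(x) = (1 + rho)^{-1} - (1 - c)/x + c f(x), as stated in the text."""
    return 1.0 / (1.0 + rho) - (1.0 - c) / x + c * mp_stieltjes(x, c)

# Illustrative parameters (assumptions, not taken from the paper).
c, rho = 0.5, 1.0
lam_spk = (1.0 + rho) * (1.0 + c / rho)  # assumed definition of lambda_spk^infty

print("G'(lambda_spk) =", G_prime(lam_spk, c, rho))
print("G'(2.95)       =", G_prime(2.95, c, rho))  # between lambda^+ and lambda_spk
print("G'(4.0)        =", G_prime(4.0, c, rho))   # beyond lambda_spk
```

With these values the derivative vanishes at $\lambda^\infty_{spk}$ and changes sign from negative to positive there, which is the behaviour used to prove (64).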

This concludes the proof of the upper bound in Lemma 1-(2). The proof of Lemma 1-(1) is very similar and left to the reader.

C. Proof of Lemma 1-(3)

The proof of this point requires an extra argument as we study the large deviations of $T_N$ near the point $(1+\sqrt{c})^2$, where the rate function is not continuous. In particular, the limit (53) does not follow from the LDP already established. As we shall see when considering $P_1\big(T_N < (1+\sqrt{c_N})^2 + \eta N^{-2/3}\big)$, the fact that the scale $N^{-2/3}$ is the same as the one of the fluctuations of the largest eigenvalue of the complex Wishart model is crucial.

We detail the proof in the case when $\rho > \sqrt{c}$ and, as above, consider the positive measures $Q_1$. We need to prove that:

$$\liminf_{N,K\to\infty} \frac{1}{N}\log Q_1\Big(T_N < (1+\sqrt{c_N})^2 + \eta N^{-2/3}\Big) \ge -G_\rho(\lambda^+), \qquad \eta\in\mathbb{R}, \qquad (65)$$

the other bound being a direct consequence of the LDP. As previously, we will carefully localize the various quantities of interest. Denote by $g_N(\eta) = (1+\sqrt{c_N})^2 + \eta N^{-2/3}$ for $\eta \in \mathbb{R}$ and by $h_N(r) = 1 - rN^{-2/3}$ for $r > 0$. Notice also that $\lambda_1 < g_N(\eta)h_N(r)$ together with $\frac{1}{K-1}\sum_{j=2}^{K}\lambda_j \ge h_N(r)$ imply that $T_N < g_N(\eta)$. We shall also consider the further constraints:

$$g_N(\eta-1)h_N(r) \le \lambda_1 \quad\text{and}\quad \lambda_2 \le g_N(\eta-2)h_N(r),$$

which enable us to properly separate $\lambda_1$ from the support of $\pi_{K,\lambda}$. Now, with the localisation indicated above, we have for $N$ large enough,

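$$Q_1\left(T_N < g_N(\eta)\right) \ge Q_1\Big( g_N(\eta-1)h_N(r) \le \lambda_1 < g_N(\eta)h_N(r),\ \frac{1}{K-1}\sum_{j=2}^{K}\lambda_j \ge h_N(r),\ \lambda_2 \le g_N(\eta-2)h_N(r),\ \pi_{K,\lambda}\in B(P_{MP},N^{-1/4}) \Big).$$

As previously, we consider the variables $y_j = \frac{N}{N-1}x_j$ for $2 \le j \le K$ and obtain, with the help of Lemma 3:

$$Q_1\left(T_N < g_N(\eta)\right) \ge \int_{g_N(\eta-1)h_N(r)}^{g_N(\eta)h_N(r)} dx_1 \int_{\mathcal{F}^N} e^{-N\Phi(x_1,c_N,\pi_{K,y})}\, e^{N c_N (J_\rho(x_1)-\delta)}\, dP^0_{K-1,N-1}(y_{2:K})$$

with

$$\mathcal{F}^N := \Big\{ (y_2,\dots,y_K)\in\Big[0,\ \frac{N g_N(\eta-2)h_N(r)}{N-1}\Big]^{K-1},\ \frac{1}{K-1}\sum_{j=2}^{K} y_j \ge \frac{N h_N(r)}{N-1},\ \pi_{K,y}\in B(P_{MP},N^{-1/4}) \Big\}.$$

Therefore:

$$Q_1\left(T_N < g_N(\eta)\right) \ge h_N(r)\big(g_N(\eta) - g_N(\eta-1)\big)\, e^{-N(G_\rho(\lambda^+)+2\delta)}\, P_0\big((\lambda_2,\dots,\lambda_K)\in\mathcal{F}^N\big)$$

(recall that $G_\rho(x) = \Phi(x,c,P_{MP}) - cJ_\rho(x)$). Now, as $h_N(r)\big(g_N(\eta) - g_N(\eta-1)\big) = (1 - rN^{-2/3})N^{-2/3}$, its contribution vanishes at the LD scale:

$$\lim_{N\to\infty} \frac{1}{N}\log\Big(h_N(r)\big(g_N(\eta) - g_N(\eta-1)\big)\Big) = 0.$$

It remains to check that $P_0\big((\lambda_2,\dots,\lambda_K)\in\mathcal{F}^N\big)$ is bounded below uniformly in $N$. This will yield the convergence of $\frac{1}{N}\log P_0\big((\lambda_2,\dots,\lambda_K)\in\mathcal{F}^N\big)$ towards zero, hence (65). Consider: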

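$$P_0\big((\lambda_2,\dots,\lambda_K)\notin\mathcal{F}^N\big) \le P_0\big(\pi_{K,\lambda}\notin B(P_{MP},N^{-1/4})\big) + P_0\Big(\frac{1}{K-1}\sum_{j=2}^{K}\lambda_j < h_N(r)\Big) + P_0\big(\lambda_2 > g_N(\eta-2)h_N(r)\big).$$

We have already used the fact that the first term goes to zero when $N$ grows to infinity. Recall that the fluctuations of $\frac{1}{K-1}\sum_{j=2}^{K}\lambda_j$ are of order $\frac{1}{N}$; therefore the second term also goes to zero, as we consider deviations of order $N^{-2/3}$. Now, $N^{2/3}\big(\lambda_2 - (1+\sqrt{c_N})^2\big)$ converges in distribution to the Tracy-Widom law, therefore the last term converges to $F_{TW}\big(\eta - 2 + r(1+\sqrt{c})^2\big) < 1$. This concludes the proof.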

Appendix B

Sketch of proof for Lemma 2: Large deviations for $U_N$

As stated in Remark 10, we shall first study the LDP for the joint quantity $(\lambda_1,\lambda_K)$. The purpose here is to outline the following convergence:

$$\frac{1}{N}\log P(\lambda_1\in A,\ \lambda_K\in B) \xrightarrow[N,K\to\infty]{} -\inf_{x\in A} I^+_{0/\rho}(x) - \inf_{y\in B} I^-(y),$$

which is an illustrative, although informal, way to state the LDP for $(\lambda_1,\lambda_K)$ (see (39)).

Consider the quantity $P\big(\lambda_1\in(\alpha_1,\beta_1),\ \lambda_K\in(\alpha_K,\beta_K)\big)$. As we are interested in the deviations of $\lambda_1$ and $\lambda_K$, the interesting scenario is $\lambda^+\notin(\alpha_1,\beta_1)$ and $\lambda^-\notin(\alpha_K,\beta_K)$ (recall that $\lambda^\pm$ are the edgepoints of the support of the Marčenko-Pastur distribution). More precisely, the interesting case is when the deviations of the extreme eigenvalues occur outside of the bulk: $\alpha_1 > \lambda^+$ and $\beta_K < \lambda^-$; such deviations happen at the rate $e^{-N\times\mathrm{const.}}$. The case where the deviations would occur within the bulk is unlikely to happen because it would force all the eigenvalues to deviate from the limiting support of the Marčenko-Pastur distribution, which happens at the rate $e^{-N^2\times\mathrm{const.}}$. Denote by $A = (\alpha_1,\beta_1)$ and $B = (\alpha_K,\beta_K)$.

Footnote: All the statements, computations and approximations below can be made precise as in the proof of Lemma 1.



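$$P(\lambda_1\in A,\ \lambda_K\in B) = \frac{1}{Z_{K,N}}\int_{A\times\mathbb{R}^{K-2}\times B} 1_{(x_1\ge\cdots\ge x_K\ge 0)} \prod_{1\le i<j\le K}(x_i-x_j)^2 \prod_{j=1}^{K} x_j^{N-K}e^{-Nx_j}\,(\cdots)\ dx_{1:K}$$

$$= \int_A dx_1\, x_1^{N-K}e^{-Nx_1} \int_B dx_K\, x_K^{N-K}e^{-Nx_K}\,(x_1-x_K)^2 \times \frac{Z_{K-2,N-2}}{Z_{K,N}} \int_{x_1\ge x_2\ge\cdots\ge x_K} \prod_{j=2}^{K-1}(x_1-x_j)^2 \prod_{j=2}^{K-1}(x_j-x_K)^2 e^{-2x_j}\,(\cdots)\ \frac{1}{Z_{K-2,N-2}} \prod_{j=2}^{K-1} x_j^{N-K}e^{-(N-2)x_j} \prod_{2\le i<j\le K-1}(x_i-x_j)^2\ dx_{2:K-1},$$

where $(\cdots)$ stands for the hypothesis-dependent factor (the spherical integral under $H_1$), which is not legible in the source.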



We shall now perform the following approximations:

$$\sum_{j=2}^{K-1}\log(x_1-x_j) \approx (K-2)\int\log(x_1-x)\,P_{MP}(dx) = (K-2)F^+(x_1),$$

$$\sum_{j=2}^{K-1}\log(x_j-x_K) \approx (K-2)\int\log(x-x_K)\,P_{MP}(dx) = (K-2)F^-(x_K),$$

$$\sum_{j=2}^{K-1}x_j \approx (K-2)\int x\,P_{MP}(dx) = (K-2),$$

and the spherical integral factor is approximated by $e^{NcJ_\rho(x_1)}$.

The three first approximations follow from the fact that $\frac{1}{K-2}\sum_{j=2}^{K-1}\delta_{x_j} \approx P_{MP}$, the last one from Lemma 3. Plugging these approximations into the expression of $P(\lambda_1\in A,\ \lambda_K\in B)$ yields:
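$$P(\lambda_1\in A,\ \lambda_K\in B) \approx \int_A dx_1\, e^{2(K-2)F^+(x_1)+(N-K)\log x_1 - Nx_1 + NcJ_\rho(x_1)} \times \int_B dx_K\, e^{2(K-2)F^-(x_K)+(N-K)\log x_K - Nx_K}\,(x_1-x_K)^2$$
$$\times\ \frac{Z_{K-2,N-2}}{Z_{K,N}}\,e^{-2(K-2)} \int_{x_1\ge x_2\ge\cdots\ge x_K} \frac{1}{Z_{K-2,N-2}} \prod_{j=2}^{K-1} x_j^{N-K}e^{-(N-2)x_j} \prod_{2\le i<j\le K-1}(x_i-x_j)^2\ dx_{2:K-1}.$$

As $x_1 \ge \alpha_1 > \lambda^+$ and $x_K \le \beta_K < \lambda^-$, the last integral goes to one as $K,N\to\infty$ and:

$$P(\lambda_1\in A,\ \lambda_K\in B) \approx \int_A dx_1\, e^{-N\left(-\frac{2(K-2)}{N}F^+(x_1) - \left(1-\frac{K}{N}\right)\log x_1 + x_1 - cJ_\rho(x_1)\right)} \times \int_B dx_K\, e^{-N\left(-\frac{2(K-2)}{N}F^-(x_K) - \left(1-\frac{K}{N}\right)\log x_K + x_K - \frac{2}{N}\log(x_1-x_K)\right)} \times \frac{Z_{K-2,N-2}}{Z_{K,N}}\,e^{-2(K-2)}.$$

Recall that we are interested in the limit of $\frac{1}{N}\log P(\lambda_1\in A,\ \lambda_K\in B)$. The last term will account for a constant $T$ (see for instance (63)):

$$\frac{1}{N}\log\Big(\frac{Z_{K-2,N-2}}{Z_{K,N}}\,e^{-2(K-2)}\Big) \xrightarrow[N,K\to\infty]{} T.$$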

The term $\frac{2}{N}\log(x_1-x_K)$ within the exponential in the integral accounts for the interaction between $\lambda_1$ and $\lambda_K$, and its contribution vanishes at the desired rate. In order to evaluate the two remaining integrals, one has to rely on Laplace's method (see for instance [53]) to express the leading term of the integrals (replacing $KN^{-1}$ by $c$ below):

$$\int_A dx_1\, e^{-N\left(-2cF^+(x_1) - (1-c)\log x_1 + x_1 - cJ_\rho(x_1)\right)} \approx e^{-N\inf_{x\in A}\left(-2cF^+(x) - (1-c)\log x + x - cJ_\rho(x)\right)},$$

$$\int_B dx_K\, e^{-N\left(-2cF^-(x_K) - (1-c)\log x_K + x_K\right)} \approx e^{-N\inf_{y\in B}\left(-2cF^-(y) - (1-c)\log y + y\right)}.$$
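The Laplace principle invoked here, $\frac{1}{N}\log\int_A e^{-N\varphi(x)}dx \to -\inf_A\varphi$, can be illustrated on a toy example. The sketch below is only an illustration under stated assumptions: the exponent $\varphi(x) = (x-1)^2 + 0.3$ and the interval $A = [1.5, 3]$ are hypothetical choices, not the functions of the paper, and the integral is evaluated with a midpoint rule combined with a log-sum-exp shift to avoid underflow.

```python
import math

def phi(x):
    """Toy exponent (an assumption for illustration); inf over [1.5, 3] is phi(1.5) = 0.55."""
    return (x - 1.0) ** 2 + 0.3

def laplace_rate(N, a=1.5, b=3.0, n=200_000):
    """Return (1/N) * log of the integral of exp(-N*phi) over [a, b] (midpoint rule)."""
    h = (b - a) / n
    exponents = [-N * phi(a + (k + 0.5) * h) for k in range(n)]
    m = max(exponents)  # shift so that the largest term is exp(0)
    log_integral = m + math.log(h * sum(math.exp(e - m) for e in exponents))
    return log_integral / N

for N in (200, 2000):
    print(N, laplace_rate(N))
```

As $N$ grows, the printed values approach $-\inf_A\varphi = -0.55$; the remaining gap is of order $\frac{\log N}{N}$, which is why polynomial prefactors do not matter at the large deviations scale.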

Finally, we get the desired limit:

$$\frac{1}{N}\log P\{\lambda_1\in A,\ \lambda_K\in B\} \xrightarrow[N,K\to\infty]{} -\inf_{x\in A}\Phi^+(x) - \inf_{y\in B}\Phi^-(y) + T,$$

where

$$\Phi^+(x) = -2cF^+(x) - (1-c)\log x + x - cJ_\rho(x),$$
$$\Phi^-(y) = -2cF^-(y) - (1-c)\log y + y.$$



It remains to replace $J_\rho$ by its expression (58) and to spread the constant $T$ over $\Phi^+$ and $\Phi^-$, which are not a priori rate functions (recall that a rate function is nonnegative). If $\lambda^-\in B$, then the event $\{\lambda_K\in B\}$ is "typical" and no deviation occurs; otherwise stated, the rate function $I^-$ should satisfy $I^-(\lambda^-) = 0$. Similarly, $I_0^+(\lambda^+) = 0$ under $H_0$ and $I_\rho^+(\lambda^\infty_{spk}) = 0$ under $H_1$. Necessarily, $T$ should write $T = \Phi^-(\lambda^-) + \Phi^+(\lambda^+)$ under $H_0$ (resp. $T = \Phi^-(\lambda^-) + \Phi^+(\lambda^\infty_{spk})$ under $H_1$) and the rate functions should be given by: $I^- = \Phi^- - \Phi^-(\lambda^-)$, $I_0^+ = \Phi^+ - \Phi^+(\lambda^+)$ under $H_0$ (resp. $I_\rho^+ = \Phi^+ - \Phi^+(\lambda^\infty_{spk})$ under $H_1$), which are the desired results.

We have proved (informally) that the LDP holds true for $(\lambda_1,\lambda_K)$ with rate function $I^+_{0/\rho}(x) + I^-(y)$. The contraction principle [44, Chap. 4] immediately yields the LDP for the ratio $\frac{\lambda_1}{\lambda_K}$ with rate function:

$$\Gamma_{0/\rho}(t) = \inf_{(x,y),\ \frac{x}{y}=t}\big\{ I^+_{0/\rho}(x) + I^-(y) \big\}, \qquad (66)$$

which is the desired result. We provide here intuitive arguments to understand this fact.

For this, interpret the value of the rate function $I_\rho^+(x)$ as the cost associated to a deviation of $\lambda_1$ (under $H_1$) around $x$: $P\{\lambda_1\in(x,x+dx)\} \approx e^{-NI_\rho^+(x)}$. If a deviation occurs for the ratio, say $\frac{\lambda_1}{\lambda_K}\in(t,t+dt)$ where $t > \frac{\lambda^\infty_{spk}}{\lambda^-}$ (which is the typical behaviour of $U_N$ under $H_1$), then necessarily $\lambda_1$ must deviate around some value $ty$, and so does $\lambda_K$ around some value $y$, so that the ratio is around $t$. In terms of rate functions, the cost of the joint deviation $(\lambda_1\approx ty,\ \lambda_K\approx y)$ is $I_\rho^+(ty) + I^-(y)$. The true cost associated to the deviation of the ratio will be the minimum cost among all these possible joint deviations of $\lambda_1$ and $\lambda_K$, hence the rate function (66).
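The minimisation in (66) can be sketched numerically. The code below is an illustration under several assumptions, not the authors' procedure: it treats the $H_0$ case only (so the $J_\rho$ term is dropped), takes $\Phi^\pm(u) = -2cF^\pm(u) - (1-c)\log u + u$ with the sign convention that makes $I_0^+$ and $I^-$ nonnegative, assumes the rate functions are infinite outside $[\lambda^+,\infty)$ and $(0,\lambda^-]$, uses the standard Marčenko-Pastur density for ratio $c < 1$, and replaces the infimum by a search over a finite grid in $y$. All function names are hypothetical.

```python
import math

def mp_average(g, c, n=2000):
    """Integrate g against P_MP (ratio c < 1) using y = 1 + c + 2*sqrt(c)*cos(theta)."""
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * math.pi / n
        y = 1.0 + c + 2.0 * math.sqrt(c) * math.cos(theta)
        total += g(y) * 2.0 * math.sin(theta) ** 2 / (math.pi * y)
    return total * math.pi / n

def make_rate_functions(c):
    """Return (I_plus, I_minus, lam_plus, lam_minus) under H0 (assumed conventions)."""
    lam_plus = (1.0 + math.sqrt(c)) ** 2
    lam_minus = (1.0 - math.sqrt(c)) ** 2

    def phi_plus(x):   # -2c F^+(x) - (1-c) log x + x
        F = mp_average(lambda y: math.log(x - y), c)
        return -2.0 * c * F - (1.0 - c) * math.log(x) + x

    def phi_minus(y0):  # -2c F^-(y0) - (1-c) log y0 + y0
        F = mp_average(lambda y: math.log(y - y0), c)
        return -2.0 * c * F - (1.0 - c) * math.log(y0) + y0

    p0, m0 = phi_plus(lam_plus), phi_minus(lam_minus)
    return (lambda x: phi_plus(x) - p0), (lambda y: phi_minus(y) - m0), lam_plus, lam_minus

def ratio_rate(t, c, grid=40):
    """Grid approximation of inf over y of I_plus(t*y) + I_minus(y), for t >= lam+/lam-."""
    I_plus, I_minus, lam_plus, lam_minus = make_rate_functions(c)
    y_lo = lam_plus / t  # smallest y such that t*y >= lam_plus
    if y_lo > lam_minus * (1.0 + 1e-12):
        raise ValueError("this sketch only handles t >= lam_plus / lam_minus")
    y_lo = min(y_lo, lam_minus)
    best = math.inf
    for k in range(grid + 1):
        y = y_lo + (lam_minus - y_lo) * k / grid
        x = max(t * y, lam_plus)  # guard against rounding just below lam_plus
        best = min(best, I_plus(x) + I_minus(y))
    return best

c = 0.5
t_typical = (1.0 + math.sqrt(c)) ** 2 / (1.0 - math.sqrt(c)) ** 2
print("rate at typical ratio:", ratio_rate(t_typical, c))
print("rate at 1.5 x typical:", ratio_rate(1.5 * t_typical, c))
```

At the typical ratio $\lambda^+/\lambda^-$ no deviation is needed and the computed cost is zero (up to rounding); for a larger ratio the cost is strictly positive, the minimiser balancing a downward deviation of $\lambda_K$ against an upward deviation of $\lambda_1$.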

Appendix C

Closed-form expressions for functions $f$, $F^+$ and $F^-$

Consider the Stieltjes transform $f$ of the Marčenko-Pastur distribution:

$$f(z) = \int \frac{P_{MP}(dx)}{x - z}.$$

We gather without proofs a few facts related to $f$, which are part of the folklore.

Lemma 4 (Representation of $f$). The following hold true:

1) Function $f$ is analytic in $\mathbb{C} - [\lambda^-,\lambda^+]$.

2) If $z\in\mathbb{C} - [\lambda^-,\lambda^+]$ with $\Re(z) > \frac{\lambda^+ + \lambda^-}{2}$, then

$$f(z) = \frac{(1-z-c) + \sqrt{(1-z-c)^2 - 4cz}}{2cz},$$

where $\sqrt{z}$ stands for the principal branch of the square-root.

3) If $z\in\mathbb{C} - [\lambda^-,\lambda^+]$ with $\Re(z) < \frac{\lambda^+ + \lambda^-}{2}$, then

$$f(z) = \frac{(1-z-c) - \sqrt{(1-z-c)^2 - 4cz}}{2cz},$$

where $-\sqrt{z}$ stands for the branch of the square-root whose image is $\{z\in\mathbb{C},\ \Re(z)\le 0\}$.

4) As a consequence, the following hold true:

$$f(x) = \frac{(1-x-c) + \sqrt{(1-x-c)^2 - 4cx}}{2cx} \quad\text{if } x > \lambda^+, \qquad (67)$$

$$f(x) = \frac{(1-x-c) - \sqrt{(1-x-c)^2 - 4cx}}{2cx} \quad\text{if } 0 < x < \lambda^-. \qquad (68)$$

5) Consider the following function $\tilde f(z) = cf(z) - \frac{1-c}{z}$. Functions $f$ and $\tilde f$ satisfy the following system of equations:

$$f(z) = -\frac{1}{z(1+\tilde f(z))}, \qquad \tilde f(z) = -\frac{1}{z(1+cf(z))}. \qquad (69)$$
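The representations (67)-(68) and the system (69) can be checked numerically on the real axis. The sketch below is a sanity check under assumptions, not part of the original proof: it uses the standard Marčenko-Pastur density $\frac{\sqrt{(\lambda^+-y)(y-\lambda^-)}}{2\pi c y}$ for ratio $c<1$ (not restated in this appendix), the substitution $y = 1 + c + 2\sqrt{c}\cos\theta$ to evaluate $\int \frac{P_{MP}(dy)}{y-x}$ by a midpoint rule, and the illustrative value $c = 0.5$. Function names are hypothetical.

```python
import math

def stieltjes_quadrature(x, c, n=4000):
    """f(x) = integral of P_MP(dy)/(y - x), for real x outside [lam-, lam+]."""
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * math.pi / n
        y = 1.0 + c + 2.0 * math.sqrt(c) * math.cos(theta)
        weight = 2.0 * math.sin(theta) ** 2 / (math.pi * y)  # P_MP density times dy/dtheta
        total += weight / (y - x)
    return total * math.pi / n

def stieltjes_closed_form(x, c):
    """Formulas (67) for x > lam+ and (68) for 0 < x < lam-."""
    lam_minus = (1.0 - math.sqrt(c)) ** 2
    lam_plus = (1.0 + math.sqrt(c)) ** 2
    root = math.sqrt((1.0 - x - c) ** 2 - 4.0 * c * x)
    if x > lam_plus:
        return ((1.0 - x - c) + root) / (2.0 * c * x)
    if 0.0 < x < lam_minus:
        return ((1.0 - x - c) - root) / (2.0 * c * x)
    raise ValueError("x must lie in (0, lam-) or (lam+, infinity)")

def system_residuals(x, c):
    """Residuals of the two equations of (69) at the real point x."""
    f = stieltjes_closed_form(x, c)
    f_tilde = c * f - (1.0 - c) / x
    return (f + 1.0 / (x * (1.0 + f_tilde)),
            f_tilde + 1.0 / (x * (1.0 + c * f)))

c = 0.5
for x in (3.0, 0.05):  # one point above lam+, one point below lam-
    print(x, stieltjes_closed_form(x, c), stieltjes_quadrature(x, c), system_residuals(x, c))
```

For both test points the closed form and the quadrature agree, and the residuals of (69) vanish up to rounding, which also confirms the expression $\tilde f(z) = cf(z) - \frac{1-c}{z}$ used in item 5.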

Recall the definitions (31) and (51) of functions $F^+$ and $F^-$. In the following lemma, we provide closed-form formulas of interest.

Lemma 5. The following identities hold true:

1) Let $x > \lambda^+$, then

$$F^+(x) = \log(x) + \frac{1}{c}\log(1 + cf(x)) + \log(1 + \tilde f(x)) + x f(x)\tilde f(x).$$

2) Let $0 < x < \lambda^-$, then

$$F^-(x) = \log(x) + \frac{1}{c}\log(1 + cf(x)) + \log\big(-(1 + \tilde f(x))\big) + x f(x)\tilde f(x).$$

Proof: Consider the case where $x > \lambda^+$. First write

$$\log(x-y) = \log(x) + \int_x^\infty \Big(\frac{1}{u} + \frac{1}{y-u}\Big)\,du.$$

Integrating with respect to $P_{MP}$ and applying Fubini's theorem yields:

$$\int\log(x-y)\,P_{MP}(dy) = \log(x) + \int_x^\infty \Big(\frac{1}{u} + f(u)\Big)\,du$$

in the case where $x > \lambda^+$. Recall that $f$ and $\tilde f$ are holomorphic functions over $\mathbb{C} - (\{0\}\cup[\lambda^-,\lambda^+])$ and satisfy system (69) (notice in particular that $1 + cf$ and $1 + \tilde f$ never vanish). Using the first equation of (69) implies that:

$$\int\log(x-y)\,P_{MP}(dy) = \log(x) - \int_x^\infty f(u)\tilde f(u)\,du. \qquad (70)$$

Consider $\Gamma(u,f,\tilde f) = \frac{1}{c}\log(1+cf) + \log(1+\tilde f) + uf\tilde f$. By a direct computation of the derivative, we get:

$$\frac{d}{du}\Gamma\big(u,f(u),\tilde f(u)\big) = f(u)\tilde f(u).$$

Hence

$$\int_x^\infty f(u)\tilde f(u)\,du = \Big[\frac{1}{c}\log(1+cf) + \log(1+\tilde f) + uf\tilde f\Big]_x^\infty = -\Big(\frac{1}{c}\log(1+cf(x)) + \log(1+\tilde f(x)) + xf(x)\tilde f(x)\Big).$$

It remains to plug this identity into (70) to conclude. The representation of $F^-$ can be established similarly.
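The two identities of Lemma 5 can be compared with a direct numerical evaluation of $\int\log|x-y|\,P_{MP}(dy)$. The following sketch is a check under assumptions rather than a proof: it uses the standard Marčenko-Pastur density for ratio $c<1$, a midpoint rule after the substitution $y = 1+c+2\sqrt{c}\cos\theta$, and the illustrative value $c = 0.5$. Function names are hypothetical.

```python
import math

def mp_average(g, c, n=4000):
    """Integrate g against P_MP (ratio c < 1) using y = 1 + c + 2*sqrt(c)*cos(theta)."""
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * math.pi / n
        y = 1.0 + c + 2.0 * math.sqrt(c) * math.cos(theta)
        total += g(y) * 2.0 * math.sin(theta) ** 2 / (math.pi * y)
    return total * math.pi / n

def stieltjes(x, c):
    """Closed forms (67)-(68) for f on the real axis outside the support."""
    lam_minus = (1.0 - math.sqrt(c)) ** 2
    lam_plus = (1.0 + math.sqrt(c)) ** 2
    root = math.sqrt((1.0 - x - c) ** 2 - 4.0 * c * x)
    if x > lam_plus:
        return ((1.0 - x - c) + root) / (2.0 * c * x)
    if 0.0 < x < lam_minus:
        return ((1.0 - x - c) - root) / (2.0 * c * x)
    raise ValueError("x must lie in (0, lam-) or (lam+, infinity)")

def F_closed_form(x, c):
    """Lemma 5: F^+(x) for x > lam+, F^-(x) for 0 < x < lam-.

    1 + f_tilde is positive above lam+ and negative below lam-, so taking the
    absolute value covers both log(1 + f_tilde) and log(-(1 + f_tilde)).
    """
    f = stieltjes(x, c)
    f_tilde = c * f - (1.0 - c) / x
    return (math.log(x) + math.log(1.0 + c * f) / c
            + math.log(abs(1.0 + f_tilde)) + x * f * f_tilde)

c = 0.5
F_plus_numeric = mp_average(lambda y: math.log(3.0 - y), c)    # F^+(3)
F_minus_numeric = mp_average(lambda y: math.log(y - 0.05), c)  # F^-(0.05)
print(F_closed_form(3.0, c), F_plus_numeric)
print(F_closed_form(0.05, c), F_minus_numeric)
```

In both cases the closed-form value and the quadrature agree to high precision, which is a useful safeguard when these formulas are used to evaluate the rate functions and error exponents in practice.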
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